
    API: Something new, something faster…

    16 April 2009

    Last night, our latest Market Package with enhancements, fixes and improvements was released. It contains one new API feature, as well as a vast improvement of the speed of another.

    As for the new feature, it is now possible to log in (‘Connect’ in API terms) as an administrator. This facilitates integration with external accounting applications, e.g. reporting tools.

    To use this feature, call the ConnectAsAdministrator() method instead of ‘logging in’ with the Connect() method. It takes as parameters the administrator user’s credentials (administrator agreement number, user ID and password), as well as the agreement number of a client to which the administrator user has access. Note that the client in question still needs to have the API add-on module enabled.

    As for optimizations, we were recently made aware of some serious bottlenecks in our GetDataArray() methods when large registers are downloaded. Somewhat embarrassing, since we’re constantly pushing the data array methods as the recommended way of bulk reading/writing of data :-(

    After digging deeper into this, we discovered that the culprit was our ORM tool’s extremely conservative approach to checking for ‘dirty objects’ in its cache. While you’d expect the execution time for a GetDataArray() call to be proportional to the number of objects retrieved, in practice it turned out to grow polynomially, i.e. much faster than linearly! Not exactly what you’d want when downloading product registers in the thousands…
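
    To see why this kind of checking hurts so much, here is a minimal toy model. It is not our ORM tool's actual code; it simply assumes that every object load re-inspects everything already in the cache. The names DirtyCheckModel and CountChecks are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

static class DirtyCheckModel
{
    // Counts the dirty checks performed when each newly loaded object
    // triggers an inspection of every object already in the cache.
    // This is an assumed, simplified model, not real ORM behaviour.
    public static long CountChecks(int objectCount)
    {
        List<bool> cacheDirtyFlags = new List<bool>();
        long checks = 0;

        for (int i = 0; i < objectCount; i++)
        {
            foreach (bool dirty in cacheDirtyFlags)
            {
                checks++; // one inspection per cached object
            }
            cacheDirtyFlags.Add(false); // the new object enters the cache clean
        }
        return checks;
    }

    static void Main()
    {
        foreach (int n in new int[] { 500, 1000, 2000 })
        {
            Console.WriteLine(n + " objects -> " + CountChecks(n) + " checks");
        }
    }
}
```

    In this model the number of checks is n(n-1)/2, so doubling the number of objects roughly quadruples the work.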

    While we could conceivably implement a generic workaround directly in our ORM tool, we have opted to minimize risk by implementing bespoke optimizations on a per-object level instead – starting with the objects our logs show to be the most frequently used.

    Last night’s release includes this optimization for the IProduct object – in practice, the Product_GetDataArray() method, which now handles a couple of thousand objects or more in a few seconds. The relative speed improvement increases with the number of objects – we’ve measured a speed-up of more than a factor of 10 with 2,000 products.

    Our next Market Package release, due in mid-May, will include similar optimizations for a lot of other objects (IDebtor, ICreditor, IAccount, IInvoice, IInvoiceLine, etc.). For all of these registers, we still recommend ‘paging’ in batches of approximately 500 objects per call to the relevant GetDataArray() method.

    For more detailed technical API documentation, take a look here.


    e-conomic up and running again

    7 April 2009

    e-conomic is now back online.

    For the record: While the e-conomic system has been unavailable, no data has been lost or compromised.

    We deeply regret the inconveniences caused by today’s downtime.


    UPDATE 2: Power outage

    7 April 2009

    Power has now been restored, but unfortunately, we’re still experiencing connectivity problems.

    Engineers are working hard to resolve this issue, and we will keep you informed of progress.


    UPDATE: Power outage

    7 April 2009

    The latest update from the power company is that they expect power to be restored to Hellerup, Copenhagen, around 5pm CET today.

    Again, we deeply regret the inconveniences this is causing you.


    Power outage

    7 April 2009

    Due to a power outage in Hellerup, Copenhagen, e-conomic is currently inaccessible.

    We deeply regret the inconvenience this is causing you, and are working hard to restore access to e-conomic.


    Improving API performance beyond data classes

    6 April 2009

    In the e-conomic API, using data classes is a powerful way of optimizing the retrieval, creation or updating of multiple entity properties – or even multiple entities – to use as few round-trips as possible.

    However, many developers are put off by the fact that when using data classes, only simple-type properties are returned – thus still requiring round-trips to retrieve properties of referenced objects.

    Suppose, for example, that we wish to list the product number, product name and product group number of all our products. The following would seem to be the obvious way of achieving this:

    IProduct[] arrP = session.Product.GetAll();
    IProductData[] arrPD = session.ProductData.GetDataArray(arrP);

    foreach (IProductData pd in arrPD)
    {
        Console.WriteLine("Number: " + pd.Number.ToString());
        Console.WriteLine("Name: " + pd.Name);
        Console.WriteLine("Group: " + pd.ProductGroup.Number.ToString());
        Console.WriteLine("");
    }

    This should result in only two round-trips – one for Product.GetAll() and one for ProductData.GetDataArray() – right? Wrong! For each product, the above code will also generate a round-trip to retrieve the number of its product group (effectively a call to the SOAP function ProductGroup_GetNumber). The final result is n+2 round-trips, with n denoting the number of products. This seems very unnecessary – especially since the number of products is usually several orders of magnitude higher than the number of product groups.

    First of all – why are the data classes designed this way? Why does the IProductData class return IProductGroup object references instead of IProductGroupData objects? While that would seem to be more consistent with the point of data classes, consider the ramifications if all data objects were to include referred objects as other data objects. In the above example:

    IProductData would include IProductGroupData
    • which would include IAccountData
    • which might include IVatAccountData (IAccount.VatAccount is NULLable)
    • which would include IAccountData
    • etc.

    In other words, this could quickly result in massively ‘recursive’ retrievals of data one would in most cases not have any use for.

    Luckily, there is another way of reducing the number of round-trips to an absolute minimum: Retrieve all product groups in one go, and store them in a dictionary with IProductGroup as key:

    IProductGroup[] arrPG = session.ProductGroup.GetAll();
    IProductGroupData[] arrPGD = session.ProductGroupData.GetDataArray(arrPG);

    Dictionary<IProductGroup, IProductGroupData> pgd = new Dictionary<IProductGroup, IProductGroupData>();

    for (int x = 0; x < arrPG.Length; x++)
    {
        pgd[arrPG[x]] = arrPGD[x];
    }

    IProduct[] arrP = session.Product.GetAll();
    IProductData[] arrPD = session.ProductData.GetDataArray(arrP);

    foreach (IProductData pd in arrPD)
    {
        Console.WriteLine("Number: " + pd.Number.ToString());
        Console.WriteLine("Name: " + pd.Name);
        Console.WriteLine("Group: " + pgd[pd.ProductGroup].Number.ToString());
        Console.WriteLine("");
    }

    This generates only four round-trips: ProductGroup.GetAll(), ProductGroupData.GetDataArray(), Product.GetAll() and ProductData.GetDataArray() – irrespective of the number of products or product groups. A similar trick can be used with any data class that includes object references.
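
    The difference in round-trips can be checked with a small toy model. The FakeServer class below is purely hypothetical and not part of the e-conomic API; it only counts one round-trip per method call, with integer handles standing in for products and product groups.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a remote API: every method call counts as one round-trip.
// None of these members belong to the real e-conomic API.
class FakeServer
{
    public int RoundTrips;
    private readonly int productCount;
    private readonly int groupCount;

    public FakeServer(int productCount, int groupCount)
    {
        this.productCount = productCount;
        this.groupCount = groupCount;
    }

    // One call returning a handle for every product.
    public int[] GetAllProducts() { RoundTrips++; return Range(productCount); }

    // One call returning a handle for every product group.
    public int[] GetAllGroups() { RoundTrips++; return Range(groupCount); }

    // Bulk call: the group handle of each given product.
    public int[] GetProductGroupHandles(int[] products)
    {
        RoundTrips++;
        int[] groups = new int[products.Length];
        for (int i = 0; i < products.Length; i++) groups[i] = products[i] % groupCount;
        return groups;
    }

    // Bulk call: the group number of each given group.
    public int[] GetGroupNumbers(int[] groups)
    {
        RoundTrips++;
        int[] numbers = new int[groups.Length];
        for (int i = 0; i < groups.Length; i++) numbers[i] = groups[i] + 100;
        return numbers;
    }

    // Single property read: one round-trip per call.
    public int GetGroupNumber(int group) { RoundTrips++; return group + 100; }

    private static int[] Range(int count)
    {
        int[] result = new int[count];
        for (int i = 0; i < count; i++) result[i] = i;
        return result;
    }
}

static class RoundTripDemo
{
    // Reads the group number through the reference, once per product.
    public static int NaiveRoundTrips(int products, int groups)
    {
        FakeServer server = new FakeServer(products, groups);
        int[] groupOfProduct = server.GetProductGroupHandles(server.GetAllProducts());
        foreach (int group in groupOfProduct) server.GetGroupNumber(group);
        return server.RoundTrips;
    }

    // Fetches all groups up front and resolves references locally.
    public static int DictionaryRoundTrips(int products, int groups)
    {
        FakeServer server = new FakeServer(products, groups);
        int[] allGroups = server.GetAllGroups();
        int[] numbers = server.GetGroupNumbers(allGroups);
        Dictionary<int, int> numberByGroup = new Dictionary<int, int>();
        for (int i = 0; i < allGroups.Length; i++) numberByGroup[allGroups[i]] = numbers[i];

        int[] groupOfProduct = server.GetProductGroupHandles(server.GetAllProducts());
        long checksum = 0;
        foreach (int group in groupOfProduct) checksum += numberByGroup[group]; // local lookup only
        return server.RoundTrips;
    }

    static void Main()
    {
        Console.WriteLine("Naive: " + NaiveRoundTrips(2000, 10));           // 2002
        Console.WriteLine("Dictionary: " + DictionaryRoundTrips(2000, 10)); // 4
    }
}
```

    With 2,000 products, the naive loop costs 2,002 round-trips in this model, while the dictionary approach stays at 4 no matter how many products there are.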

    A final, and somewhat related, performance tip: Throwing the result of GetAll() at GetDataArray() may of course still lead to massive amounts of data being downloaded. As a general rule, we recommend that you split the GetAll() result into batches of 500 entities for the subsequent data retrieval. While this will result in slightly more round-trips, it is much less prone to timeouts at either end.
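
    The batching itself needs nothing from the API. A minimal sketch could look like the following, where the Batching class and its InBatches helper are hypothetical names of our own choosing, and an int array stands in for the handles returned by GetAll():

```csharp
using System;
using System.Collections.Generic;

static class Batching
{
    // Splits an array into consecutive slices of at most batchSize items.
    public static IEnumerable<T[]> InBatches<T>(T[] items, int batchSize)
    {
        if (items == null) throw new ArgumentNullException("items");
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        for (int offset = 0; offset < items.Length; offset += batchSize)
        {
            int count = Math.Min(batchSize, items.Length - offset);
            T[] batch = new T[count];
            Array.Copy(items, offset, batch, 0, count);
            yield return batch;
        }
    }
}

static class BatchingDemo
{
    static void Main()
    {
        // Stand-in for the array returned by session.Product.GetAll().
        int[] handles = new int[1234];

        foreach (int[] batch in Batching.InBatches(handles, 500))
        {
            // With the real API, each batch would be passed to
            // session.ProductData.GetDataArray(batch) at this point.
            Console.WriteLine("Batch of " + batch.Length);
        }
    }
}
```

    For 1,234 entities this yields batches of 500, 500 and 234, i.e. three data round-trips instead of one very large one.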

    Take a look at our API documentation.


    Performance bughunt

    3 April 2009

    Over the last 6 weeks, we have repeatedly experienced random slowdowns in our e-conomic system. Because of these slowdowns, you, our customers, have experienced unavailability or slow performance of our system. We are working with our hosting partner to solve the problems, and are using external database experts from two different vendors in Denmark. We have also been developing logging tools to help us get more information out of the system, so that we can better pinpoint the cause of the slowdowns.

    Because the problem is random, we have no way of correlating the slowdowns with other scheduled activities, which makes it more difficult to find and solve the problem. The slowdowns have occurred in the middle of the day, at night and on a Sunday morning – so even the time of day has no influence on the situation ;-) . This past Monday, a database expert identified further possible solutions, which we deployed on the night between 1 and 2 April:

    1) Max degree of parallelism was changed from the default (0) to 4, and

    2) ‘read_committed_snapshot’ was set to ‘on’, using ‘with rollback immediate’, and

    3) we added 9 additional database files to the PRIMARY filegroup

    It is still too early to say whether these changes have killed the slowdown bug, but after 2.5 days of good performance and no measurable slowdowns, a small hope is surfacing.

    We, of course, regret the situation more than anyone, and will continue to monitor and improve availability and performance.