Magento Commerce API Fun 🙄

As a consultant you get to work with many different systems, some easy, some difficult and some downright weird.

Magento Commerce fits firmly into the last category.

For those who don’t know, Magento is a very popular e-commerce platform owned by Adobe and used by some of the largest companies in the world.

How APIs Normally Work

When using APIs it’s standard practice to use pagination to avoid retrieving too many records at once and potentially maxing out the memory resources of either the server or your client.

Pagination means requesting records in batches: you fetch the first ‘page’ of data, then the next, and so on until all the data is downloaded.
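As a rough sketch of that pattern, here is a generic paging loop. The `fetch_page` callable and the `fake_fetch` stand-in are made up for illustration; the assumption is a well-behaved API that returns an empty list once you ask for a page past the end.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[dict]],
             page_size: int = 1000) -> Iterator[dict]:
    """Yield records page by page until the API returns an empty page."""
    page = 1
    while True:
        records = fetch_page(page, page_size)
        if not records:
            # A well-behaved API returns nothing past the last page
            break
        yield from records
        page += 1


# Hypothetical stand-in for a real API call: 2,500 fake records served in slices
DATA = [{"id": n} for n in range(2500)]


def fake_fetch(page: int, page_size: int) -> List[dict]:
    start = (page - 1) * page_size
    return DATA[start:start + page_size]


all_records = list(paginate(fake_fetch))
print(len(all_records))
```

With the fake data this takes three non-empty pages (1000, 1000 and 500 records) and stops on the empty fourth one.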

Magento’s Commerce API is well documented and at first look seems pretty standard.

Magento uses a fairly standard syntax to request pages. This requests 1000 records per batch and the first page of results:

&searchCriteria[pageSize]=1000&searchCriteria[currentPage]=1
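If you would rather build that query string than edit it by hand, something like the following should work. The `page_query` helper is my own name for illustration, not part of any Magento client library.

```python
from urllib.parse import urlencode


def page_query(page: int, page_size: int = 1000) -> str:
    """Build the paging part of the query string for a given page."""
    params = {
        "searchCriteria[pageSize]": page_size,
        "searchCriteria[currentPage]": page,
    }
    # safe="[]" keeps the brackets readable instead of percent-encoding them
    return urlencode(params, safe="[]")


print(page_query(1))
# → searchCriteria[pageSize]=1000&searchCriteria[currentPage]=1
```

Append the result to whatever endpoint URL and other search criteria you are already using.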

It’s the return data that’s a mess.

Normally the return message would include a link to the next page. Not Magento.

Normally if you request a page beyond the end of the dataset you’ll get no data. Magento just returns the last items over and over again.

To find out if I’ve reached the end of the dataset I ended up reading the response, removing the page number, then hashing the contents and comparing that to the hash of the previously returned page. If they’re the same, I’m getting the same data again, so I can stop 🙄

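import hashlib
import requests

# base_URL, headeroauth and download_dir are assumed to be defined earlier
i = 1
last_hash = None
try:
    while True:
        # get next page
        URL = base_URL.replace('[currentPage]=1', f'[currentPage]={i}')
        r = requests.get(URL, auth=headeroauth)

        # remove page number and hash contents
        # (hashlib.sha256 stands in here for my gethash helper)
        body = r.text.replace(f'"current_page":{i}', '')
        page_hash = hashlib.sha256(body.encode('utf-8')).hexdigest()
        if page_hash == last_hash:
            # same data as the previous page, so I'm at the end
            break

        last_hash = page_hash
        print(URL)
        with open(f'{download_dir}customer_page{i}.json', 'w') as f:
            f.write(r.text)
        i += 1
except Exception as e:
    print(str(e))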
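To try the stopping logic without hitting a live store, here is a small offline sketch. The `fake_get` function is a made-up stand-in that mimics the behaviour described above (pages past the end repeat the last page), and the JSON shape is simplified for illustration rather than copied from a real response.

```python
import hashlib
import json


def page_fingerprint(body: str, page: int) -> str:
    """Hash a response body with the echoed page number stripped out."""
    normalised = body.replace(f'"current_page":{page}', "")
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def fake_get(page: int) -> str:
    """Hypothetical API stand-in: pages past the end repeat the last page."""
    pages = [[1, 2], [3, 4], [5]]
    items = pages[min(page, len(pages)) - 1]
    payload = {"items": items, "search_criteria": {"current_page": page}}
    return json.dumps(payload, separators=(",", ":"))


saved = []
last = None
page = 1
while True:
    body = fake_get(page)
    fingerprint = page_fingerprint(body, page)
    if fingerprint == last:
        # Identical to the previous page once the page number is removed
        break
    saved.append(body)
    last = fingerprint
    page += 1

print(len(saved))
```

The loop keeps the three real pages and stops on the fourth request, because that response only differs from the third in its page number.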

This works most of the time, except when the underlying data changes between the two ‘final’ calls, so I get the same data set again but with a few ‘extra’ records. Because of this I still have to check for duplicates before loading the data into a database.
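A minimal sketch of that duplicate check, assuming each record carries a unique `id` field (the field name and the sample records are assumptions for illustration):

```python
def dedupe_by_id(records, key="id"):
    """Keep one record per key; a later copy overwrites an earlier one."""
    unique = {}
    for record in records:
        unique[record[key]] = record
    return list(unique.values())


# Two downloaded pages where the second repeats a record from the first
pages = [
    [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}],
    [{"id": 2, "email": "b@example.com"}, {"id": 3, "email": "c@example.com"}],
]
flat = [record for page in pages for record in page]
customers = dedupe_by_id(flat)
print([c["id"] for c in customers])
# → [1, 2, 3]
```

Letting the later copy win means that if a record changed between calls, the more recently fetched version is the one that reaches the database.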

While Googling for the issue I was pleased to find that Reddit tends to agree with my sentiments 😁

[Screenshot: Reddit comment on the Magento API. Caption: Reddit agrees with my sentiments]
