How we solved Magento core_url issues once and for all

Today, we tell you the story of how we solved our Magento core_url issues, a story that might sound very familiar to you, since you’ve landed here.

If you have been working with Magento for long enough, chances are that sooner or later you’ve learnt that there are two fundamental steps to follow whenever you are having issues:

  • It’s always the cache! Whenever something doesn’t work, clear the cache, and I mean, all of them! And the problem will (or not) go away, it’s like rebooting in Windows!
  • If that doesn’t work, run a full reindex, and repeat the above step 🙂

That’s why it is not only recommended, but also very common, to run reindexes daily, and most likely even more often, whenever we want to immediately reflect some changes in the front-end, such as new stock, new products, or new category-product assignments.

Reindexing often shouldn’t be a problem for small stores, but the more products you have, the more likely it is that reindexing will cause you trouble, due to the long time it takes, the amount of resources it needs, and so on, as well as the fact that it needs to run overnight to avoid slowing down your website dramatically.

Whenever a full site reindex takes a long time, in 99% of cases the “catalog_url” reindex is the one spending most of that time, especially if you have a large catalog of products and/or a multi-store setup.

Unfortunately or otherwise, that was our case at our bike shop. We had several thousand products and, even though a big part of them were disabled, the reindex was taking far too long, and the table core_url_rewrite was growing exponentially, causing other problems such as slowing down the backup process, or preventing the sitemap generation from running, due to big SQL queries taking far too long to run.

Like in most existing online stores, all of this could have been easily avoided if it had been addressed properly from the beginning, but you know… could’ve, would’ve, should’ve… but didn’t!

So, there we were, with a core_url_rewrite table with over 210k rows, and expanding beyond maintainability. We had only two choices:

  • Upgrade the servers and ignore/delay the problem until we reached this same stage again in a couple of months’ time.
  • Tackle the problem and invest the time needed to find out what was going on, and sort it out properly.

We love challenges, and the first option wasn’t even considered, as it wasn’t really a solution, so we rolled up our sleeves and started to look into the possible causes.

We identified two major problems looking into core_url_rewrite table:

  • There were several thousand rows with URLs for products that were disabled or not visible individually. We had already installed the Dn’D module that skips the reindex of disabled and not-visible products, speeding up the reindex by keeping them out of the table, but there were “leftovers” from the past, from before the module was installed. If you don’t have the Dn’D Patch Index Url module installed, please take a few minutes and install it right now; it’s a must-have for any Magento store.
  • We noticed a very interesting issue: every time we ran a catalog_url reindex, the number of rows in the core_url_rewrite table increased, even when there were no recent changes to the catalog. So, basically, the table was growing and growing each time without any obvious reason.

In order to address the first problem, we built two queries to identify the rows corresponding to:

Disabled products:

SELECT COUNT(*) FROM core_url_rewrite WHERE product_id IN
    (SELECT entity_id FROM catalog_product_entity_int
     WHERE attribute_id = (SELECT attribute_id FROM eav_attribute WHERE attribute_code = 'status')
       AND `value` = 2 AND entity_type_id = 4);

Roughly 2k rows.

Then, we did the same to identify the products with Visibility set to Not Visible Individually:

SELECT COUNT(*) FROM core_url_rewrite WHERE product_id IN
    (SELECT entity_id FROM catalog_product_entity_int
     WHERE attribute_id = (SELECT attribute_id FROM eav_attribute WHERE attribute_code = 'visibility')
       AND `value` = 1 AND entity_type_id = 4);

Nearly 8k rows.

Since the above queries return rows that correspond to products not displayed on the front-end (because they are disabled or not visible in the catalogue), they can safely be removed without any impact on the front-end/SEO.
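The actual clean-up is then just a matter of turning those counts into deletes. A sketch, assuming the same sub-queries as above (test it on a copy, and always back up core_url_rewrite before deleting):

```sql
-- Remove rewrites for disabled products (status = 2)
DELETE FROM core_url_rewrite WHERE product_id IN
    (SELECT entity_id FROM catalog_product_entity_int
     WHERE attribute_id = (SELECT attribute_id FROM eav_attribute WHERE attribute_code = 'status')
       AND `value` = 2 AND entity_type_id = 4);

-- Remove rewrites for products not visible individually (visibility = 1)
DELETE FROM core_url_rewrite WHERE product_id IN
    (SELECT entity_id FROM catalog_product_entity_int
     WHERE attribute_id = (SELECT attribute_id FROM eav_attribute WHERE attribute_code = 'visibility')
       AND `value` = 1 AND entity_type_id = 4);
```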

The infinitely growing “duplicated” redirects in core_url_rewrite

This is a very interesting issue which is widely known among Magento developers. There is a good thread on StackExchange discussing it, where you can find responses of all kinds. In the above link, people have done a great job analysing the problem, and some even propose decent (partial) solutions to it, but given the complexity, many people have also given up on finding a fix. In fact, Alan Storm himself, one of the most popular Magento developers, advises exactly that, and his answer is the second most voted, which by the way is pretty disappointing to say the least.

In our case, it was unacceptable to leave it as it was, because we were aware of the issue and had been delaying it for a while. Eventually, it was causing us serious issues and having a pretty bad impact on the overall performance of our store, so it had to be addressed.

As we mentioned previously, every time we reindexed the “catalog_url”, the number of rows in the core_url_rewrite table increased, and we were trying to figure out why. It turns out the problem arises when there are two or more products/categories with the same URL key. During the reindex, Magento checks whether the URL key is unique; if it’s not, it appends the entity_id of the product to the URL to make it unique. The problem is that the next time the reindex runs, it repeats the same process, but the new URL (url_key + entity_id) already exists, so Magento keeps increasing the numeric part of the URL key (the product’s entity id) until it finds an unused URL, and it keeps creating crazy redirects on each run.
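To check whether your catalog suffers from this, a query along these lines (assuming the standard Magento 1 EAV schema, where url_key is a varchar product attribute) lists the duplicated URL keys:

```sql
-- Find url_key values shared by more than one product
SELECT `value` AS url_key, COUNT(*) AS products
FROM catalog_product_entity_varchar
WHERE attribute_id = (SELECT attribute_id FROM eav_attribute
                      WHERE attribute_code = 'url_key' AND entity_type_id = 4)
GROUP BY `value`
HAVING COUNT(*) > 1;
```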

One way to avoid/overcome this issue is making sure that the url_keys of the products are unique. We tested that approach first, running a query that modified the url_key attribute of each product that had duplicates, appending a dash followed by the entity_id. I.e. the product with id 123 and url_key “my_product” would become “my_product-123”. This way, when the reindex ran, all URLs would be unique, and Magento wouldn’t have to do “its thing”. However, this would mess up existing URL redirects in core_url_rewrite and could potentially lead to broken links. Also, it wouldn’t fully resolve the issue, as nothing would prevent new products from being created/imported with duplicated URL keys in the future, so we had to find a better solution.

Since many people had been investigating and debugging this problem, we decided to have a look at the proposed solutions on StackExchange and tested some of them. The solution proposed by @Simon was the one that made the most sense to us, and it seemed to do the job. After applying the code changes, reindexing wouldn’t increase the number of rows of the table. However, the proposed solution was overriding the core, which is something we never do unless it’s unavoidable. It’s a nice solution, wrapped as an “unofficial Magento patch”, but we didn’t want to take the risk of future upgrade/patching issues. So, we created a new Magento extension which rewrites the model Mage_Catalog_Model_Url, and we put Simon’s fix there, leaving the core untouched. It all seemed to work fine, and the table core_url_rewrite stabilised. The first step was done; now it was time for the tricky part, the remaining “leftovers”.

We had 210k rows (well, actually 200k after the previous clean-up of disabled/not-visible products) in the core_url_rewrite table but, as we mentioned previously, only a couple of thousand products enabled in our catalog, so presumably many rows were not really “needed”, and most likely the “oversize” of the table was related to the infinite rewrites/redirects.

Looking at the latest entries of core_url_rewrite, there was a clear pattern in the redirects:

product_id | request_path              | target_path
1234       | product-url-key-1234.html | product-url-key-1235.html
12         | product-url-key-13.html   | product-url-key-14.html

Both request_path and target_path look the same except for the number appended to the url_key. So, we came up with this query to identify all the redirects:

SELECT COUNT(*) FROM core_url_rewrite
WHERE product_id IS NOT NULL
  AND LEFT(request_path, LENGTH(request_path) - 5 - LENGTH(product_id)) =
      LEFT(target_path, LENGTH(target_path) - 5 - LENGTH(product_id))
  AND is_system = 0;

Result: 195k rows!!! Unbelievable, we thought those rewrites were bad, but we didn’t expect that they would be THAT BAD!

Right, so we wanted to get rid of all those “junk” redirects, but we couldn’t just delete them. Well, technically we could, but that could have a very bad SEO impact: it would generate many 404 errors for products already indexed by the search engines, and there would be existing links to the products in newsletters, social media, etc. So we needed a strategy to fix it.

After a lot of testing and back and forth, eventually we did the following:

  • Create a table core_url_rewrite_tmp which contains only the redirects (can be done in one query, using CREATE TABLE … AS SELECT):
    CREATE TABLE core_url_rewrite_tmp AS
    SELECT category_id, product_id, store_id, request_path, target_path FROM core_url_rewrite WHERE product_id IS NOT NULL AND LEFT(request_path, length(`request_path`) - 5 - length(product_id)) = LEFT(target_path, length(`target_path`) - 5 - length(product_id));
  • Override Mage_Cms_IndexController::noRouteAction to capture 404 events
  • Then, whenever a 404 happens, check if the request matches the pattern '/\/(.*)-\d+\.html/'
    • If it doesn’t, let Magento handle the 404 normally
    • If it does, query our new table and search for matches on the current store, fetching the product_id or category_id.
      • If there are no matches, let Magento handle the 404 normally
      • If there are matches, redirect the user to the corresponding product/category (301 – permanent redirect) URL
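The steps above can be sketched roughly as follows (a minimal Magento 1 controller rewrite; the lookup and redirect logic here is an illustrative assumption, not our exact production code):

```php
<?php
// Sketch: rewrite of Mage_Cms_IndexController::noRouteAction (Magento 1),
// assuming the core_url_rewrite_tmp table described above exists.
require_once 'Mage/Cms/controllers/IndexController.php';

class MyCompany_UrlFallback_IndexController extends Mage_Cms_IndexController
{
    public function noRouteAction($coreRoute = null)
    {
        $requestPath = trim($this->getRequest()->getPathInfo(), '/');

        // Only URLs that look like "<url-key>-<number>.html" can be old redirects
        if (preg_match('/^(.*)-\d+\.html$/', $requestPath)) {
            $read = Mage::getSingleton('core/resource')->getConnection('core_read');

            $row = $read->fetchRow(
                $read->select()
                    ->from('core_url_rewrite_tmp')
                    ->where('request_path = ?', $requestPath)
                    ->where('store_id = ?', Mage::app()->getStore()->getId())
            );

            if ($row && !empty($row['product_id'])) {
                // 301 (permanent) redirect to the product's current URL
                $url = Mage::getModel('catalog/product')
                    ->load($row['product_id'])
                    ->getProductUrl();
                $this->getResponse()->setRedirect($url, 301);
                return;
            }
        }

        // No match: let Magento handle the 404 normally
        parent::noRouteAction($coreRoute);
    }
}
```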

After having this kind of fallback mechanism, the redirect entries can be removed from the core_url_rewrite table.
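With the fallback in place, deleting the junk rows uses the same condition we used for counting them. A sketch (again, back the table up first):

```sql
DELETE FROM core_url_rewrite
WHERE product_id IS NOT NULL
  AND LEFT(request_path, LENGTH(request_path) - 5 - LENGTH(product_id)) =
      LEFT(target_path, LENGTH(target_path) - 5 - LENGTH(product_id))
  AND is_system = 0;
```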

Then, the core_url_rewrite table is completely clean and light, which, this being one of the most critical, queried, and joined tables in Magento, has a shocking impact on the overall performance of the site, not to mention the massive speed increase in heavy tasks such as sitemap generation, the core URL reindex, and so on.

In our case, as a result of this whole clean-up, core_url_rewrite now has 5k rows. We still have to keep core_url_rewrite_tmp, but that table is queried on very rare occasions (compared to the frequency of core_url_rewrite), only when an “old URL” is requested.

Last but not least, since we are sending all the requests to the right destination with a 301 redirect, within a couple of months all the search engines will have updated their indexes and the table can be “safely” removed.

Note that there might still be old links in emails, social media, etc., but the fallback can then be amended to redirect users either to the catalog search (searching for the requested URI), or to do a best-match on core_url_rewrite and redirect the user to the closest match. This won’t be 100% accurate, as the products with duplicated URL keys have similar URL keys/names. However, in this fast-moving age, after 6 months those links are highly unlikely to be hit.

We deleted our table after 3 months and, two weeks later, we haven’t found any related 404s in our reports.

Finally, our zipped database (excluding orders) dropped from 80MB to 5MB, all our sitemap generation and reindex issues were resolved, and our website’s overall speed/performance improved noticeably, so we couldn’t be happier 🙂

Magento 2 Composer Module Update

People often ask us how to do a Magento 2 Composer module update for their extensions whenever they want to release a new version of their modules.

There are different ways of doing it; we will show you how to do it with git tags and a semantic versioning strategy for the releases, which we think is the most professional and reliable way.

We will continue with the example of our previous post where we showed you how to create a hello world extension in Magento 2 with composer.

For the newcomers to composer, let me quickly give you a brief explanation of how it works, more specifically, how we use it together with Magento 2 while developing new modules:

  • To install a new extension developed by ourselves (or by a third party): we normally use “composer require vendor-name/extension-name” from the root folder of Magento 2, which, by default (depending on the values we’ve set for “minimum-stability” and “prefer-stable” in our composer.json file), will attempt to install the latest version of that extension that meets its dependencies/requirements within your system/application. This means that, for example, if you are using PHP 5.5.x and the latest version of the extension that you are installing has a dependency (composer.json “require” section) like:
    "php":">=5.6"

    Then, the latest version of the extension cannot be installed on your system. Composer will first try to install a newer version of the dependency and, failing that, will search backwards through previous versions of the extension until it finds the most recent version that works with PHP 5.5. Eventually, it will fail if it cannot find any installable version (one that meets all the dependencies) on your system. Normally, you just run the command and everything runs smoothly; composer will take care of everything for you, but sometimes it will fail with an error like the one below:

    composer require lumbrales-software/magento2-first-module 
    Using version dev-master for lumbrales-software/magento2-first-module     
    ./composer.json has been updated
    Loading composer repositories with package information
    Updating dependencies (including require-dev)                             
    Your requirements could not be resolved to an installable set of packages.
    
      Problem 1
        - magento/project-community-edition dev-master requires magento/product-community-edition 2.0.2 -> no matching package found.
        - magento/project-community-edition 2.0.2 requires magento/product-community-edition 2.0.2 ->  no matching package found.
        - magento/project-community-edition 2.0.1 requires magento/product-community-edition 2.0.1 ->  no matching package found.
        - magento/project-community-edition 2.0.0-rc2 requires magento/product-community-edition 2.0.0-rc2 ->  no matching package found.
    

    This can sometimes require a system update or, as in the above example, adding a custom repository where composer can find the required package.

  • To release a new version (code changes) of your extension: this is the interesting part. People are usually confused and run “composer update” expecting it to do the job, but the problem is that this command can often lead to undesired and unexpected results, since it will try to update every single package, which is not always (almost never) what we want. Normally, you want to update a specific package, or maybe even a whole vendor, but not everything at once. Therefore, we recommend telling composer exactly what to do, as in the example from our previous post:
    # Update a specific package
    composer update lumbrales-software/magento2-first-module
    # Update all packages of the vendor
    composer update lumbrales-software/*
    

Having said that, now let’s focus on how to release a new version of our extension.

The process is actually simpler than it might seem. All you have to do is implement your new functionality or code changes, using whichever git branching strategy you want, and once you are ready, follow these steps:

  • Merge your changes into the master branch
  • Choose a version for the update/release, following the guidelines of semantic versioning:
    Given a version number MAJOR.MINOR.PATCH, increment the:

    1. MAJOR version when you make incompatible API changes,
    2. MINOR version when you add functionality in a backwards-compatible manner, and
    3. PATCH version when you make backwards-compatible bug fixes.

    Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
    Note: If you want to release a stable version, it has to be at least 1.0.0, otherwise it will be considered unstable by composer and might lead to issues while installing it.

  • Update your composer.json with the new version (i.e. “version”: “1.0.1”) and commit the change.
  • Create a git tag with the new version:
    git tag 1.0.1
  • Push the changes and the tag:
    git push origin master
    git push --tag
  • That’s it! Now, you should be able to run composer update your-vendor/your-module and it will fetch your changes:
composer update lumbrales-software/magento2-first-module -v
Loading composer repositories with package information
Reading composer.json of lumbrales-software/magento2-first-module (1.0.2)
Importing tag 1.0.2 (1.0.2.0)
Reading composer.json of lumbrales-software/magento2-first-module (1.0.1)
Importing tag 1.0.1 (1.0.1.0)
Reading composer.json of lumbrales-software/magento2-first-module (master)
Importing branch master (dev-master)
Updating dependencies (including require-dev)
  - Removing lumbrales-software/magento2-first-module (1.0.1)
  - Installing lumbrales-software/magento2-first-module (1.0.2)
    Downloading: 100%         
    Extracting archive

Writing lock file
Generating autoload files

Note: sometimes composer seems to cache the repository metadata and won’t fetch your new code changes. The official way to clear it is “composer clear-cache”; alternatively, we found a cruder workaround by wiping the cache directory (run it at your own risk):

rm -fr ~/.composer/cache/*

That’s about it. You should now be able to release code changes nicely; for every new code change you just need to bump the version accordingly, and remember to push both the version update in composer.json and the git tag.

Good luck!

Let us know if you have any issues with it and we’ll try to assist you.

Evil Magento extensions

Evil Magento Extensions I: Email marketing

Magento Commerce is a really good source of extensions: some of them are excellent, some of them are cheap or even free, some of them are not great… and some of them are what we call “Evil Magento Extensions”!

First of all, we would like to apologise if anyone gets offended; that is not the purpose of this post. Rather, we want to contribute to the Magento ecosystem by sharing our experiences, always aiming to improve the Magento community.

We always want the best Magento extensions for our store and we want to spend as little money as possible, but depending on your specific shop’s needs, you might want to think twice before installing one of those Evil Magento Extensions.

There are way too many extensions out there and we obviously haven’t tried them all, but from our experience, the ones provided by email marketing platforms are usually the most interesting ones. It is very common to use a third-party service to send transactional emails, and these services mostly tend to provide a free (and evil) Magento extension which is pretty much “plug & play” for your store, so that you don’t have to worry about anything… or should you?

The usual scenario is that when a user places an order, or creates a new account, an email is sent, and we’ve found that most of these extensions rely on Magento observers (i.e. customer_register_success, sales_order_place_after), or even rewrite the Mage_Core_Model_Email_Template class, in order to push the data to their servers. That would all be fine, if it weren’t for the fact that almost all of them do it synchronously, i.e. within the same request made by your customer. This means that, for instance, when your user is placing a new order, she has to wait while your server communicates with your email platform: sending the data of the order and waiting for a response. This is a “standard” process that usually doesn’t take longer than a few milliseconds/seconds, which is fine… hang on, are you sure?

The astonishing truth is that this process can (and will) hinder your customers’ experience in many different ways, but you probably already know about them if you are reading this post. A request from your store to an external server can take anything from a few milliseconds to several seconds, minutes, or even hours! And many things can go wrong while that’s happening, often leading to an error displayed to your customer, or just to another abandoned cart, tired of waiting.

Ok, I might be exaggerating a bit; there would surely be a timeout somewhere in this process that would prevent a request from taking hours, but sadly, we have seen MANY cases where a request would hang for several minutes. Of course, it’s not the usual case, but it will happen sooner or later. It happens to Amazon, to Facebook, even to Google; it happens to the best of us, whether due to network issues, power failures, too many concurrent users, or whatever other reason.

The point is that your store and your customers will be affected, and if they have a bad experience they will not come back. Bear in mind that it doesn’t take a delay of minutes to impact your sales; just a few extra seconds can prevent your customers from having a successful checkout experience.

The reason behind it is that when a new order is placed, Magento has to do quite a few database operations (decrease the stock of all products, insert all order items, save all customer data, etc., and finally create the order in the sales_flat_order table) and, in order to preserve data integrity, it wraps them all within a database transaction. In other words, Magento will execute all these operations atomically, so that if an error occurs, the whole transaction is rolled back and no changes are made to the database.

So, what happens if there is an observer in the middle of this complex process that performs a request to the external server of your email provider? Yes, it will delay the process a few seconds. You might think that it’s not a big deal, but did you know that while this is happening, some tables of your database are locked by this transaction, which is effectively preventing new orders from being placed until that transaction has finished?

In other words, if two customers place an order at almost the same time, the second one will be blocked by the first one. Internally, the queries in the database will be queued up and executed as soon as the first transaction finishes, but queued queries will only wait for a limited time (typically 50 seconds) and, if the lock hasn’t been released by then, an error/exception will be thrown and the order will not be placed. It might seem an unlikely event but, unfortunately, it happens more often than we would wish, and the more concurrent customers you have, the more likely this is to happen.
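That 50-second figure is MySQL’s default InnoDB lock wait timeout; you can check the value on your own server with:

```sql
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';
```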

From our experience, we’ve seen it happening in shops with low traffic during sale periods: usually with around 100+ concurrent users it can happen a few times a day, and with 300+ concurrent users it can happen almost every hour.
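The safe pattern, which the better extensions follow, is to make the observer asynchronous: store the payload locally during the customer’s request and push it to the email platform later from a cron job. A rough sketch (module, table, and method names are made up for illustration):

```php
<?php
// Sketch of an asynchronous observer (Magento 1 style).
// "mycompany_email_queue" is a hypothetical table with columns:
// id, event, payload, created_at.

class MyCompany_AsyncEmail_Model_Observer
{
    // Wired via config.xml to sales_order_place_after.
    // Only inserts a row locally, so the customer's request is never
    // blocked by a slow external API.
    public function queueOrderPlaced(Varien_Event_Observer $observer)
    {
        $order    = $observer->getEvent()->getOrder();
        $resource = Mage::getSingleton('core/resource');
        $resource->getConnection('core_write')->insert(
            $resource->getTableName('mycompany_email_queue'),
            array(
                'event'      => 'sales_order_place_after',
                'payload'    => json_encode(array('increment_id' => $order->getIncrementId())),
                'created_at' => Varien_Date::now(),
            )
        );
    }

    // Called from a cron job every few minutes; this is where the slow
    // HTTP call to the email platform happens, outside any customer request.
    public function flushQueue()
    {
        // ... read pending rows, POST them to the email platform, delete on success
    }
}
```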

If you see something like this in your exceptions.log, you might well be suffering from one of those Evil Magento Extensions:

exception 'PDOException' with message 'SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction' in .../lib/Zend/Db/Statement/Pdo.php:228

.../app/code/core/Mage/Sales/Model/Resource/Order/Abstract.php(425): Mage_Core_Model_Resource_Db_Abstract->save(Object(Mage_Sales_Model_Order))

We’ve seen this issue with many popular Magento extensions and well-known email marketing platforms such as Dotmailer, ExactTarget or Sailthru, so next time you are about to install a new Magento extension, make sure it’s reviewed first by a Magento expert to avoid awkward surprises.

If you stumble upon this issue, do not hesitate to get in touch and we will be able to assist you.

How-to create a Magento 2 Composer Module

If you are a developer new to Magento 2, you are probably wondering how to create a Magento 2 Composer module. Hopefully the steps below will be useful to you. This is how I managed to do it, eventually.

First of all, you should familiarise yourself with composer and its basic usage.

This guide assumes you’ve already installed and configured a vanilla Magento 2 project. If you haven’t done that yet, I’d advise you to check this out and follow the official guide. Feel free to drop me a line if you have any issues.

I’ve found quite a few sites that show how to create a Magento 2 module “the old way”, by copying the module contents into the app/code folder (note that in Magento 1.x it used to be app/code/codePool, but that doesn’t exist anymore in Magento 2).

However, as I’ve recently been working on Symfony 2 projects, I’m quite used to composer and its usage for easily integrating third-party components/packages. So, once I saw the “composer.json” file in the root directory of Magento 2, I knew that it had to be the way forward for developing Magento 2 extensions.

Let’s get started. I’m assuming you’ve already developed your Magento 2 extension, as the purpose of this guide is to show how to package the extension as a composer (sub)module, and creating an extension from scratch would go beyond the scope of this post. Check this guide to find out how to create a Magento 2 extension, or just clone its source code from GitHub.

  • To begin with, you should create a new (github/bitbucket…) repository, and place your source code inside.
  • In the root folder, create a new file named composer.json. This file will describe the details of your package. This is how mine looks:
    {
        "name": "lumbrales-software/magento2-first-module",
        "description": "Hello world Magento 2 Composer Module!",
        "require": {
            "magento/project-community-edition": "*"
        },
        "type": "magento2-module",
        "version": "0.1.0",
        "extra": {
            "map": [
                [
                    "*",
                    "Yourcompany/YourModule/"
                ]
            ]
        },
        "authors": [
            {
                "name": "Javi Lumbrales",
                "homepage": "https://www.lumbrales-software.com/",
                "role": "Developer"
            }
        ]
    }

    The most important parts are described below:

    • The name of the package: this will serve as the identifier of your package, and will be used later on.
    • Requirements: we’ve added “magento/project-community-edition” as a requirement, as this module requires Magento 2 in order to work.
    • Type: we specify “magento2-module” so that its contents are copied into app/code once the package is installed (more about this later).
    • Version: not important, as we’ll use git tagging to handle the composer module updates.
    • Extra → map: this tells composer to copy all (*) contents into app/code/Yourcompany/YourModule, so the root of your repository should already be structured as the extension itself. This means that you don’t need to keep your files under app/code/Yourcompany/YourModule in your repository; instead, they should be directly in the root. This is how your repository should look:

Repository Root:
Block/
Controller/
etc/

  • Push all your code to the master branch (I’ll explain the tagging in a separate post later on)
  • Go to the root folder of your Magento 2 installation, and type the following commands in your terminal:
composer require magento/magento-composer-installer

This is a package required to properly install third-party Magento 2 extensions as composer submodules (remember the type “magento2-module” mentioned at the beginning).

  • Then, as your module is not yet in a package repository such as packagist.org or packages.magento.com, you need to specify your own VCS repository so that composer can find it. In the repositories section of the composer.json file of the Magento 2 project (if the repositories section doesn’t exist, create it; otherwise, add your repository at the end), add the following:
 "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/youraccount/yourmodule-repository"
    }
  ],
  • Since your package is still in a development stage, you will also need to add the minimum-stability setting to the composer.json file:
"minimum-stability": "dev",

Note that this could be removed later on, once we have properly tagged/released a stable version of our package/extension.

  • After that, you should be able to install your module as follows (remember how you named it):
composer require your-package/name #In our example the name was lumbrales-software/magento2-first-module

The above command should download your module and copy its content into app/code/YourCompany/YourModule.

  • Last, but not least, you will need to enable your module and clear Magento cache. Add a new entry in app/etc/config.php, under the ‘modules’ section (Magento 2 project):
    'Yourcompany_YourModule' => 1,

Note that this name should match the name that you’ve put in the composer.json, section extra → map.

That’s it, after that your module should be ready to go.

You can download the source code of my Magento 2 Composer Module sample here.

I hope it helps, and let me know if you have any issues!

In my next post I’ll show a basic usage of git tagging and composer to be able to easily release your module updates.

Credits:
https://www.ashsmith.io/2014/12/simple-magento2-controller-module/
https://alankent.wordpress.com/2014/08/03/creating-a-magento-2-composer-module/ 

 

Proxy & Debug SOAP Requests in PHP

Last week, we needed to debug SOAP requests in PHP due to an issue in our live environment. We had attempted to integrate our Magento store with an external provider that used SOAP to communicate with our servers. Everything was working fine in our testing/staging environments, but when we went live, they contacted us about an issue with the connection:

Error Fetching http body, No Content-Length, connection closed or chunked data.

After a quick search, we found an easy fix: falling back to HTTP 1.0 on the client connection:

$client = new SoapClient(
    "http://yourserver.com/api/?wsdl",
    array(
        'stream_context' => stream_context_create(
            array('http' => array('protocol_version' => 1.0))
        )
    )
);

However, to make things more interesting, the external service wasn’t willing to make any changes on their side, which meant we needed to find another solution, on our side.

It did seem to be an environment-related issue rather than a code problem. Anyway, we tried to debug the whole connection flow, to see if we could find clues as to how to solve the problem.

We have a sample script that we use to debug connections locally (using xdebug), with the code below:


class Debug_SoapClient extends SoapClient
{
    public function __doRequest($request, $location, $action, $version, $one_way = 0)
    {
        $aHeaders = array(
            'Method: POST',
            'Connection: Close',
            'Content-Type: text/xml',
            'SOAPAction: "' . $action . '"'
        );

        $ch = curl_init($location);
        curl_setopt_array(
            $ch,
            array(
                CURLOPT_VERBOSE        => false,
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_POST           => true,
                CURLOPT_POSTFIELDS     => $request,
                CURLOPT_HEADER         => false,
                CURLOPT_HTTPHEADER     => $aHeaders,
                CURLOPT_SSL_VERIFYPEER => true,
                // Attach the xdebug cookie so the request triggers a debug session
                CURLOPT_COOKIE         => "XDEBUG_SESSION=PHPSTORM"
            )
        );

        return curl_exec($ch);
    }
}

//Use soap as always, just replacing the classname while instantiating the object:
$client = new Debug_SoapClient('http://myurl.com/api/?wsdl');

However, we can’t use xdebug on the live environment, so we tried to find a way to debug the connection flow there, so that we could compare the differences between using HTTP 1.0 and the standard way.

After a couple of unsuccessful attempts, we found a hacky way to proxy-forward the requests through curl, so that we could debug the input and the output of each call.

First, we created a test PHP file that would log the environment variables to a file for each request, and then forward the request to the SOAP API:

file_put_contents(__DIR__ . '/soap.log', var_export($_SERVER, true), FILE_APPEND);
file_put_contents(__DIR__ . '/soap.log', var_export($_REQUEST, true), FILE_APPEND);
file_put_contents(__DIR__ . '/soap.log', var_export($_GET, true), FILE_APPEND);
file_put_contents(__DIR__ . '/soap.log', var_export($_POST, true), FILE_APPEND);
file_put_contents(__DIR__ . '/soap.log', var_export($HTTP_RAW_POST_DATA, true), FILE_APPEND);
// Override the REQUEST_URI variable so that the framework can understand and process the SOAP request
$_SERVER['REQUEST_URI'] = '/api/?wsdl';
include 'index.php';

Note that sometimes $_POST might be empty but $HTTP_RAW_POST_DATA might contain data.

After comparing the logs of the two requests, we noticed a few differences in the $_SERVER variables, but even overriding those values didn’t help.

More research revealed that the issue might be related to a bug in some PHP versions:

It is an HTTP 1.1 issue with some versions of PHP not properly decoding chunked data (even some versions of PHP 5.3.x will still see this error, the documentation on PHP’s official site is wrong). You have two options in this case:

(1) Update your version of PHP to 5.4.x or later.
(2) Force the client library to use HTTP 1.0

We tried connecting from another app server with an older PHP version, and it worked without having to fall back to HTTP 1.0.

Then we thought about proxying the requests to the other server internally, just to test whether that would solve the issue. Eventually, we managed to get the requests forwarded with another script:


header('Content-Type: text/xml; charset=UTF-8');
$ch = curl_init('https://external.server.com/soaptest.php/' . ($_GET ? '?' . http_build_query($_GET) : 'api/index/index/'));
curl_setopt_array(
    $ch,
    array(
        CURLOPT_VERBOSE        => false,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => $_SERVER['REQUEST_METHOD'] == 'POST',
        CURLOPT_POSTFIELDS     => $HTTP_RAW_POST_DATA,
        CURLOPT_HEADER         => false,
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_HTTP_VERSION   => CURL_HTTP_VERSION_1_0,
    )
);
$ret = curl_exec($ch);
file_put_contents(__DIR__ . '/soap.log', var_export($ret, true), FILE_APPEND);
echo $ret;

With this script curl-ing the previous one on the other server, we managed to get the SOAP calls to work. It’s something we did only for debugging purposes, but hopefully it can be useful to you if you face similar issues and want to rule out possibilities. Sadly, that still didn’t solve the issue for the non-HTTP 1.0 requests, but at least we were able to compare all the inputs and outputs, and we noticed that everything was working fine both ways up until the actual SOAP call:

– Instantiating the class with the url

– Authenticating to soap by calling login()

– Calling a custom SOAP method: e.g. sales_order.info would return the proper results in both cases, but only the HTTP 1.0 connection would properly retrieve them. The standard connection would retry the call and eventually display the error message shown at the beginning of this post.

Eventually, we solved the issue by temporarily pointing the requests directly at the server with the older PHP version, until PHP was upgraded on the failing server.