Topic: AH01067: Failed to read FastCGI header

Hi.  I have been banging my head against an issue, even after trying many potential solutions found through web research, including some posts by Remi.

This is a PHP/Apache Linux server using php-fpm.  It runs on an Amazon EC2 instance with Amazon Linux 2, PHP 7.2.16, and Apache 2.4.39.

First, this issue just started happening.  The basic components involved have worked for many years and continue to work, except for this one new use case.  We started using CDATA connectors to external data sources (e.g., a PostgreSQL database), and when a request takes more than a minute or so, a 503 error is returned somewhere along the way.  The php-fpm child segfaults, and php-fpm drops it and (correctly) starts a new one.  Here are the error messages logged in various places on the server at the time of the error:

In the php-fpm error_log: “child 16618 exited on signal 11 (SIGSEGV) after 1447.671492 seconds from start”

In the /var/log/messages log: “kernel: php-fpm[17334]: segfault at 7f67559ff9c8 ip [HIDDEN] sp 00007ffc4f53b3e8 error 6 in libpthread-2.26.so[7f677439d000+19000]”

In the httpd ssl_error_log:
[Mon May 20 16:14:20.199845 2019] [proxy_fcgi:error] [pid 16585] [client IP ADDRESS HIDDEN:35282] AH01067: Failed to read FastCGI header
[Mon May 20 16:14:20.199868 2019] [proxy_fcgi:error] [pid 16585] (104)Connection reset by peer: [client IP ADDRESS HIDDEN:35282] AH01075: Error dispatching request to :

And the curl error that comes back and seems to halt our backend process is: “The requested URL returned error: 503 Service Unavailable”
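For reference, the "error 6" field in the kernel line above can be decoded. On x86 it is the page-fault error code, a small bit mask. Below is a minimal sketch of the decoding; the function name decode_segv_error is made up for illustration, and only the three low bits are handled.

```shell
#!/usr/bin/env bash
# Decode the "error N" field of an x86 kernel segfault line.
# decode_segv_error is a hypothetical helper name, not an existing tool.
decode_segv_error() {
    local code=$1
    local cause access mode
    # bit 0: 0 = page not present, 1 = protection violation
    if (( code & 1 )); then cause="protection violation"; else cause="page not present"; fi
    # bit 1: 0 = read access, 1 = write access
    if (( code & 2 )); then access="write"; else access="read"; fi
    # bit 2: 0 = kernel mode, 1 = user mode
    if (( code & 4 )); then mode="user mode"; else mode="kernel mode"; fi
    echo "$access, $mode, $cause"
}

decode_segv_error 6   # prints: write, user mode, page not present
```

So the child attempted a user-mode write to an address that was not mapped. That narrows the type of fault but does not identify which code is responsible; a backtrace is still needed for that.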

I have researched and tried fixes for almost everything I can find on these errors and related topics (timeouts, proxy handling, code on our side, etc.), including some suggestions from Remi found online.  Again, everything else continues to work just fine.  Only this one use case throws these errors, and even then it works without error about 5% of the time.

I've focused a lot on PHP and Apache timeout settings, but many other jobs on this server run for very long periods (hours or more) and have never had this issue.  It only started appearing for this specific use case of the CDATA drivers.

I know there is a lot of other info that could help, but first I'm wondering whether anyone recognizes this sequence of errors and has an idea to throw out.

Re: AH01067: Failed to read FastCGI header

Hi,

No immediate idea, but can you get a core file and then a gdb backtrace?
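For anyone following along, here is a rough sketch of one way to set that up. It is not an official procedure. It assumes a systemd service named php-fpm, a binary at /usr/sbin/php-fpm, and /tmp as the core directory; all of these may differ on your system. It also assumes you have added "rlimit_core = unlimited" to the php-fpm configuration yourself. The script only prints what it would do unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
# Sketch: prepare a host to capture php-fpm core dumps.
# Dry run by default; set APPLY=1 and run as root to actually execute.
# CORE_DIR and the "php-fpm" service name are assumptions.
CORE_DIR=${CORE_DIR:-/tmp}

# Run the command, or just print it when in dry-run mode.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

prepare_core_dumps() {
    # Write cores as <dir>/core-<executable>.<pid>
    run sysctl -w "kernel.core_pattern=${CORE_DIR}/core-%e.%p"
    # php-fpm workers drop privileges, so allow dumps from such processes
    run sysctl -w fs.suid_dumpable=2
    # Restart so an "rlimit_core = unlimited" line in php-fpm.conf takes effect
    run systemctl restart php-fpm
}

prepare_core_dumps

# After the next crash (binary path is an assumption):
#   gdb /usr/sbin/php-fpm "${CORE_DIR}/core-php-fpm.<pid>"
#   (gdb) bt
```

Installing the matching debuginfo packages for php before running gdb should make the backtrace far more readable.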

Laptop:  Fedora 38 + rpmfusion + remi (SCL only)
x86_64 builder: Fedora 39 + rpmfusion + remi-test
aarch64 builder: RHEL 9 with EPEL
Hosting Server: CentOS 8 Stream with EPEL, rpmfusion, remi