http: unify header treatment by marco-ippolito · Pull Request #46528 · nodejs/node

marco-ippolito · 2023-02-06T17:01:40Z

resolves: #46395
~~I'm not sure what the test test/parallel/test-http-server-response-standalone.js impacted by this change was trying to achieve @Trott~~

nodejs-github-bot · 2023-02-06T17:01:45Z

Review requested:

@nodejs/http
@nodejs/net

Trott · 2023-02-06T17:41:23Z

> I'm not sure what the test test/parallel/test-http-server-response-standalone.js impacted by this change was trying to achieve @Trott

It was added in #14387 by @mcollina so maybe he can add context if that PR doesn't supply the information you need.

climba03003

It doesn't look right to me.
The headers should be processed before ._send, so that whether they are merged with the data or left pending for the first write, the receiver always gets the same bytes.

ShogunPanda · 2023-02-06T20:28:39Z

@climba03003 Are you referring to the test, right?

climba03003 · 2023-02-06T20:33:51Z

> @climba03003 Are you referring to the test, right?

I mean the implementation.
Combining the header and the data is done to increase throughput.
Removing !encoding is a de-optimization, since most people don't specify an encoding.

The trade-off is to process the headers with a consistent encoding before combining them with the data.
That adds a bit of processing time, but it still combines header and data in most cases.

node/lib/_http_outgoing.js

Lines 438 to 587 in a2a954b

    function _storeHeader(firstLine, headers) {
      // firstLine in the case of request is: 'GET /index.html HTTP/1.1\r\n'
      // in the case of response it is: 'HTTP/1.1 200 OK\r\n'
      const state = {
        connection: false,
        contLen: false,
        te: false,
        date: false,
        expect: false,
        trailer: false,
        header: firstLine
      };

      if (headers) {
        if (headers === this[kOutHeaders]) {
          for (const key in headers) {
            const entry = headers[key];
            processHeader(this, state, entry[0], entry[1], false);
          }
        } else if (ArrayIsArray(headers)) {
          if (headers.length && ArrayIsArray(headers[0])) {
            for (let i = 0; i < headers.length; i++) {
              const entry = headers[i];
              processHeader(this, state, entry[0], entry[1], true);
            }
          } else {
            if (headers.length % 2 !== 0) {
              throw new ERR_INVALID_ARG_VALUE('headers', headers);
            }
            for (let n = 0; n < headers.length; n += 2) {
              processHeader(this, state, headers[n + 0], headers[n + 1], true);
            }
          }
        } else {
          for (const key in headers) {
            if (ObjectPrototypeHasOwnProperty(headers, key)) {
              processHeader(this, state, key, headers[key], true);
            }
          }
        }
      }

      let { header } = state;

      // Date header
      if (this.sendDate && !state.date) {
        header += 'Date: ' + utcDate() + '\r\n';
      }

      // Force the connection to close when the response is a 204 No Content or
      // a 304 Not Modified and the user has set a "Transfer-Encoding: chunked"
      // header.
      //
      // RFC 2616 mandates that 204 and 304 responses MUST NOT have a body but
      // node.js used to send out a zero chunk anyway to accommodate clients
      // that don't have special handling for those responses.
      //
      // It was pointed out that this might confuse reverse proxies to the point
      // of creating security liabilities, so suppress the zero chunk and force
      // the connection to close.
      if (this.chunkedEncoding && (this.statusCode === 204 ||
                                   this.statusCode === 304)) {
        debug(this.statusCode + ' response should not use chunked encoding,' +
              ' closing connection.');
        this.chunkedEncoding = false;
        this.shouldKeepAlive = false;
      }

      // keep-alive logic
      if (this._removedConnection) {
        // shouldKeepAlive is generally true for HTTP/1.1. In that common case,
        // even if the connection header isn't sent, we still persist by default.
        this._last = !this.shouldKeepAlive;
      } else if (!state.connection) {
        const shouldSendKeepAlive = this.shouldKeepAlive &&
            (state.contLen || this.useChunkedEncodingByDefault || this.agent);
        if (shouldSendKeepAlive && this.maxRequestsOnConnectionReached) {
          header += 'Connection: close\r\n';
        } else if (shouldSendKeepAlive) {
          header += 'Connection: keep-alive\r\n';
          if (this._keepAliveTimeout && this._defaultKeepAlive) {
            const timeoutSeconds = MathFloor(this._keepAliveTimeout / 1000);
            let max = '';
            if (~~this._maxRequestsPerSocket > 0) {
              max = `, max=${this._maxRequestsPerSocket}`;
            }
            header += `Keep-Alive: timeout=${timeoutSeconds}${max}\r\n`;
          }
        } else {
          this._last = true;
          header += 'Connection: close\r\n';
        }
      }

      if (!state.contLen && !state.te) {
        if (!this._hasBody) {
          // Make sure we don't end the 0\r\n\r\n at the end of the message.
          this.chunkedEncoding = false;
        } else if (!this.useChunkedEncodingByDefault) {
          this._last = true;
        } else if (!state.trailer &&
                   !this._removedContLen &&
                   typeof this._contentLength === 'number') {
          header += 'Content-Length: ' + this._contentLength + '\r\n';
        } else if (!this._removedTE) {
          header += 'Transfer-Encoding: chunked\r\n';
          this.chunkedEncoding = true;
        } else {
          // We should only be able to get here if both Content-Length and
          // Transfer-Encoding are removed by the user.
          // See: test/parallel/test-http-remove-header-stays-removed.js
          debug('Both Content-Length and Transfer-Encoding are removed');
        }
      }

      // Test non-chunked message does not have trailer header set,
      // message will be terminated by the first empty line after the
      // header fields, regardless of the header fields present in the
      // message, and thus cannot contain a message body or 'trailers'.
      if (this.chunkedEncoding !== true && state.trailer) {
        throw new ERR_HTTP_TRAILER_INVALID();
      }

      this._header = header + '\r\n';
      this._headerSent = false;

      // Wait until the first body chunk, or close(), is sent to flush,
      // UNLESS we're sending Expect: 100-continue.
      if (state.expect) this._send('');
    }

    function processHeader(self, state, key, value, validate) {
      if (validate)
        validateHeaderName(key);
      if (ArrayIsArray(value)) {
        if (
          (value.length < 2 || !isCookieField(key)) &&
          (!self[kUniqueHeaders] || !self[kUniqueHeaders].has(StringPrototypeToLowerCase(key)))
        ) {
          // Retain for(;;) loop for performance reasons
          // Refs: https://github.com/nodejs/node/pull/30958
          for (let i = 0; i < value.length; i++)
            storeHeader(self, state, key, value[i], validate);
          return;
        }
        value = ArrayPrototypeJoin(value, '; ');
      }
      storeHeader(self, state, key, value, validate);
    }
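
The trade-off described in this comment can be illustrated with a minimal sketch. It is not the code in `_http_outgoing.js`: `firstWriteBytes` is a hypothetical helper that only mimics the two paths under a simplified model. It assumes that a string body is concatenated with the header and encoded once (UTF-8 when no encoding is given), and that otherwise the header is written on its own as latin1.

```javascript
'use strict';

// Hypothetical helper, not a Node.js API: returns the bytes that would reach
// the socket for the first write under the simplified model described above.
function firstWriteBytes(header, body, encoding) {
  if (typeof body === 'string') {
    // Fast path: one string, one encode, one write.
    return Buffer.from(header + body, encoding || 'utf8');
  }
  // Slow path: the header is encoded separately as latin1, the body follows.
  return Buffer.concat([Buffer.from(header, 'latin1'), body]);
}

// '\u00e5' is the character 'å' used elsewhere in this thread.
const header = 'Content-Disposition: attachment; filename="\u00e5.txt"\r\n\r\n';

const withStringBody = firstWriteBytes(header, 'hi');
const withBufferBody = firstWriteBytes(header, Buffer.from('hi'));

// The same header is serialized differently depending on the body type:
// 'å' becomes c3 a5 (UTF-8) in one case and e5 (latin1) in the other.
console.log(withStringBody.includes(Buffer.from([0xc3, 0xa5]))); // true
console.log(withBufferBody.includes(Buffer.from([0xe5])));       // true
console.log(withStringBody.length - withBufferBody.length);      // 1
```

Under these assumptions, the receiver sees different header bytes for the same header value, which is the inconsistency the discussion is about.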

ShogunPanda · 2023-02-06T20:36:01Z

I knew about the optimization. I was double-checking.
I guess when the encoding is not set we have to make sure we encode the headers correctly. Hopefully it does not result in a big performance loss.

climba03003 · 2023-02-06T20:53:24Z

> I guess when encoding it's not set we have to make sure we correctly encode the headers. Hoping it does not result in big performance loss.

Inside the code, Node.js always expects the header to be latin1-encoded.
So we can do something like Buffer.from(headerValue, 'latin1').toString('utf-8') in _storeHeader.
Maybe there is another approach that is more performant.

In this case, we can expect the non-string branch to always be utf-8, since we have already done the transformation.
In the string branch, latin1 or utf-8 encoding doesn't really matter, since the bytes have already changed.

// from utf-8 to latin1
const first = Buffer.from('å', 'utf8').toString('latin1')
console.log(first, Buffer.from(first)) // <Buffer c3 83 c2 a5>

// here is what should be done beforehand
const second = Buffer.from('å', 'latin1').toString('utf8')
console.log(second, Buffer.from(second)) // <Buffer ef bf bd>

// when data is latin1
const third = Buffer.from('ef bf bd', 'hex')
console.log(third, Buffer.from(third, 'latin1')) // <Buffer ef>
// when data is utf-8 or no encoding
const fourth = Buffer.from('ef bf bd', 'hex')
console.log(fourth, Buffer.from(fourth, 'utf8')) // <Buffer ef>

Edit: hmm, the truncated buffer doesn't seem correct.
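
The truncation comes from the hex decoder rather than from the encodings under discussion: `Buffer.from(string, 'hex')` stops at the first character that is not a hex digit, and the input contains spaces. A small sketch of the difference, using only standard `Buffer` calls:

```javascript
'use strict';

// Hex decoding stops at the first non-hex character, so the space after
// "ef" truncates the result to a single byte.
const truncated = Buffer.from('ef bf bd', 'hex');
console.log(truncated); // <Buffer ef>

// Without spaces, all three bytes of U+FFFD (the replacement character) survive.
const full = Buffer.from('efbfbd', 'hex');
console.log(full); // <Buffer ef bf bd>

// Passing a Buffer to Buffer.from() copies its bytes; no string is encoded here.
const copy = Buffer.from(full);
console.log(copy.equals(full)); // true

// Interpreting the same bytes as latin1 or as UTF-8 gives different strings.
console.log(full.toString('latin1').length); // 3
console.log(full.toString('utf8').length);   // 1
```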

climba03003 · 2023-02-06T20:59:58Z

The problem is actually caused by content-disposition allowing latin1 while other headers are always restricted to ASCII.

Maybe special handling of content-disposition in _storeHeader would do the trick without a massive change.
Same as nodejs/undici#1903 (comment)

marco-ippolito · 2023-02-06T22:17:46Z

@climba03003 I've changed it to check inside processHeader: if the header is content-disposition and there is a content-length, it encodes the value in latin1. It will not impact performance, but it feels more like a workaround than a full solution.
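
A rough sketch of the kind of check described here, for illustration only. `prepareHeaderValue` and its `hasContentLength` parameter are hypothetical names, not the identifiers used in the PR, and the real change lives inside `processHeader`.

```javascript
'use strict';

// Hypothetical helper illustrating the described check; not the PR's code.
// Only content-disposition values are re-encoded, so every other header
// keeps the existing fast path.
function prepareHeaderValue(name, value, hasContentLength) {
  const isContentDisposition = name.toLowerCase() === 'content-disposition';
  if (isContentDisposition && hasContentLength && typeof value === 'string') {
    return Buffer.from(value, 'latin1');
  }
  return value;
}

const encoded = prepareHeaderValue(
  'Content-Disposition', 'attachment; filename="\u00e5.txt"', true);
console.log(Buffer.isBuffer(encoded)); // true
console.log(encoded.includes(0xe5));   // true: 'å' is the single byte e5

const untouched = prepareHeaderValue('X-Test', 'abc', true);
console.log(untouched); // abc
```

Limiting the extra work to one header name is what keeps the cost negligible, at the price of special-casing a single field.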

ShogunPanda · 2023-02-07T10:22:16Z

The RFC itself is a special case, so I guess we have to act in a "workaround" way.

LGTM.

ShogunPanda · 2023-02-10T15:28:05Z

@nodejs/http Are we good on this?

mcollina

lgtm

marco-ippolito · 2023-02-15T10:37:13Z

I think this is going to be a breaking change, wdyt? @ronag @Trott

ronag · 2023-02-15T10:38:16Z

I think it's more of a bug fix than a breaking change...

nodejs-github-bot · 2023-02-20T10:05:18Z

CI: https://ci.nodejs.org/job/node-test-pull-request/49772/

nodejs-github-bot · 2023-02-20T11:29:51Z

CI: https://ci.nodejs.org/job/node-test-pull-request/49778/

nodejs-github-bot · 2023-02-21T07:26:58Z

CI: https://ci.nodejs.org/job/node-test-pull-request/49817/

nodejs-github-bot · 2023-02-21T08:59:57Z

CI: https://ci.nodejs.org/job/node-test-pull-request/49821/

nodejs-github-bot · 2023-02-21T10:11:45Z

CI: https://ci.nodejs.org/job/node-test-pull-request/49825/

ArsalanDotMe · 2023-12-01T08:23:45Z

This PR introduced a bug #50978 and there is an associated PR #50977 to fix it. Please review when possible.
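
One plausible way a multi-value header can break with a latin1 workaround of this kind is sketched below. It assumes the header value is handed to `Buffer.from` unchanged even when it is an array; the linked issue and fix should be consulted for the actual cause.

```javascript
'use strict';

const values = [
  'attachment; filename="a.txt"',
  'attachment; filename="b.txt"',
];

// An array is treated as a list of byte values, not as strings to encode.
// Each non-numeric string converts to NaN, which is stored as 0.
const wrong = Buffer.from(values, 'latin1');
console.log(wrong); // <Buffer 00 00>

// Encoding each entry separately keeps the text intact.
const right = values.map((v) => Buffer.from(v, 'latin1'));
console.log(right.map((b) => b.toString('latin1')));
```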

Per HTTP spec, header values are byte strings that should be decoded using isomorphic decode (latin1), not UTF-8. The new interceptor API (onResponseStart) was incorrectly using UTF-8. This change:
- Updates parseHeaders and parseRawHeaders to use latin1 encoding
- Removes the content-disposition workaround that was needed when headers were inconsistently decoded (see nodejs/node#46528)
- Adds tests to verify latin1 decoding behavior

Fixes #4753
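
The difference between isomorphic (latin1) decoding and UTF-8 decoding can be checked with a short sketch that uses only standard `Buffer` calls. It is an illustration of the encoding property, not code from either repository.

```javascript
'use strict';

// Every possible byte value, 0x00 through 0xff.
const allBytes = Buffer.from(Array.from({ length: 256 }, (_, i) => i));

// latin1 maps each byte to the code point with the same number, so the
// round trip is lossless.
const viaLatin1 = Buffer.from(allBytes.toString('latin1'), 'latin1');
console.log(viaLatin1.equals(allBytes)); // true

// UTF-8 decoding replaces invalid sequences with U+FFFD, so the original
// bytes cannot be recovered.
const viaUtf8 = Buffer.from(allBytes.toString('utf8'), 'utf8');
console.log(viaUtf8.equals(allBytes)); // false
```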

http: unify header treatment

a2a954b

nodejs-github-bot added http Issues or PRs related to the http subsystem. needs-ci PRs that need a full CI run. labels Feb 6, 2023

climba03003 reviewed Feb 6, 2023

http: processHeader check isContentDispositionField

89d6e25

ShogunPanda approved these changes Feb 7, 2023

mcollina approved these changes Feb 11, 2023

ronag approved these changes Feb 15, 2023

ShogunPanda approved these changes Feb 20, 2023

ShogunPanda added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 20, 2023

github-actions Bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 20, 2023

ShogunPanda added the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 21, 2023

jk1z mentioned this pull request Oct 17, 2023

Incorrect validation of non-utf8 characters in setHeader function #50213

Open

lux01 mentioned this pull request Nov 27, 2023

content-disposition request header is incorrectly handled in nodejs 19.9.0 node-fetch/node-fetch#1734

Open

This was referenced Nov 30, 2023

http: handle multi-value content-disposition header correctly #50977

Merged

multi-value content-disposition header value results in invalid header value #50978

Closed

mcollina mentioned this pull request Jan 25, 2026

fix: decode HTTP headers as latin1 instead of utf8 nodejs/undici#4768

Merged

4 tasks
