
Fix: strip UTF-8 BOM from .env files #640

Open
h1whelan wants to merge 1 commit into theskumar:main from h1whelan:fix-bom-handling


@h1whelan

Summary

Fixes #637

When a .env file is saved with a UTF-8 BOM (Byte Order Mark), the BOM character (\ufeff) is prepended to the first variable name, so the variable cannot be looked up under its intended key and no error is raised. This is a common issue with JetBrains IDEs on Windows, which often save files with a BOM by default.

Before:

dotenv_values("file_with_bom.env")
# {'\ufeffFIRST': 'value1', 'SECOND': 'value2'}  ← first key is corrupted

After:

dotenv_values("file_with_bom.env")
# {'FIRST': 'value1', 'SECOND': 'value2'}  ← all keys are correct
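The "Before" behaviour can be reproduced without python-dotenv. The following is a minimal sketch using only the standard library; `naive_parse` is a hypothetical toy KEY=VALUE splitter introduced for illustration, not the library's parser. It shows that a file written with a BOM and decoded as plain UTF-8 carries U+FEFF into the first key.

```python
import os
import tempfile


def naive_parse(text: str) -> dict:
    """Hypothetical toy KEY=VALUE parser (assumption, not python-dotenv's)."""
    result = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            result[key] = value
    return result


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_with_bom.env")
    # The "utf-8-sig" codec writes the BOM bytes EF BB BF before the content.
    with open(path, "w", encoding="utf-8-sig") as f:
        f.write("FIRST=value1\nSECOND=value2\n")
    with open(path, "rb") as f:
        raw = f.read()
    # Decoding as plain "utf-8" keeps the BOM as the character U+FEFF.
    with open(path, encoding="utf-8") as f:
        text = f.read()

parsed = naive_parse(text)
print(parsed)
```

Only the first key is affected, because the BOM appears once at the very start of the file.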

Changes

  • src/dotenv/parser.py: Strip the UTF-8 BOM (\ufeff) from the beginning of the stream content in Reader.__init__ using str.removeprefix().
  • tests/test_parser.py: Add two test cases for BOM handling: a single variable, and multiple variables with a BOM prefix.
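The parser change described above might look roughly like the sketch below. `BomStrippingReader` is a simplified stand-in written for illustration and is not the actual `Reader` class from src/dotenv/parser.py; only the idea of calling `str.removeprefix()` on the stream content comes from this PR. Note that `str.removeprefix()` requires Python 3.9 or later.

```python
import io
from typing import IO

UTF8_BOM = "\ufeff"  # how the UTF-8 BOM appears after decoding to str


class BomStrippingReader:
    """Simplified stand-in for a reader (assumption, not the real class)."""

    def __init__(self, stream: IO[str]) -> None:
        # removeprefix drops the BOM only when it is the leading character
        # and returns the string unchanged otherwise.
        self.string = stream.read().removeprefix(UTF8_BOM)


with_bom = BomStrippingReader(io.StringIO("\ufeffFIRST=value1\n"))
without_bom = BomStrippingReader(io.StringIO("FIRST=value1\n"))
print(repr(with_bom.string), repr(without_bom.string))
```

Unlike `lstrip("\ufeff")`, `removeprefix` removes at most one leading occurrence, so content without a BOM passes through untouched.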

Test plan

  • All 45 parser tests pass (43 existing + 2 new)
  • All 114 main tests pass
  • Verified fix with actual BOM-encoded .env file
  • Verified no regression for files without BOM
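The last two checks in the test plan can be sketched as a standalone script. The tests added in this PR are not shown here, so this is only an assumed shape: `load_pairs` is a hypothetical helper that strips a leading BOM and splits KEY=VALUE lines, standing in for the real parser.

```python
import os
import tempfile


def load_pairs(path: str) -> dict:
    """Hypothetical helper: read a file, strip a leading BOM, split pairs."""
    with open(path, encoding="utf-8") as f:
        text = f.read().removeprefix("\ufeff")
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)


def check_bom_and_plain() -> tuple:
    """Write the same content with and without a BOM and parse both files."""
    content = "FIRST=value1\nSECOND=value2\n"
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        for name, encoding in (("bom.env", "utf-8-sig"), ("plain.env", "utf-8")):
            path = os.path.join(tmp, name)
            with open(path, "w", encoding=encoding) as f:
                f.write(content)
            results.append(load_pairs(path))
    return tuple(results)


bom_result, plain_result = check_bom_and_plain()
print(bom_result == plain_result)
```

Comparing the two results directly covers both the fix (BOM file) and the regression check (plain file) in one assertion.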

🤖 Generated with Claude Code

… loss

When a .env file is saved with a UTF-8 BOM (common with JetBrains IDEs
on Windows), the BOM character (\ufeff) was prepended to the first
variable name, making it silently inaccessible via its intended key.

Strip the BOM in Reader.__init__ so all variables are parsed correctly
regardless of whether the file contains a BOM.

Fixes theskumar#637

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

First environment variable in a .env file is silently ignored on a file with BOM (Byte Order Mark) format
