D

GitHub

Serious concerns
Microsoft · 🇺🇸 United States
Policy · App Permissions · Network Traffic · Firmware · Regulatory
Technical details
Manufacturer: Microsoft

The bottom line

GitHub promised not to train Copilot on your private code. But a class-action lawsuit showed Copilot reproducing verbatim chunks of open-source code (including copyright notices and licence text) without attribution. If the AI memorised and regurgitated GPL-licensed code, it violated the licence. If it did the same with your private code, you'd never know. The training data went from GitHub to OpenAI to Copilot. GitHub controls what goes in. OpenAI controls what comes out. You control nothing in between.

You write code in VS Code. Copilot sends it to OpenAI for completion. OpenAI runs on Microsoft Azure and now AWS. Microsoft owns GitHub and 27% of OpenAI. Your proprietary code passes through three companies with $185 billion in mutual financial obligations. A developer writing trade secrets has their code context flowing through a code host, an AI company, and a cloud provider, all connected by ownership and investment. They promised the platforms are separate. The money says they aren't.

Legal jurisdiction
🇺🇸 United States (headquarters)
CLOUD Act
US govt can demand your data from this company even if stored overseas
FISA §702 / PRISM
NSA collects stored emails, photos, messages without individual warrants
Geofence warrants
Police can demand location data for everyone near a crime scene
Spying
0/4 N/A
Is someone spying on me?
Data Sharing
1/4 LOW
Who gets my data?
Security
2/4 MODERATE
Is it actually secure?
Honesty
2/4 MODERATE
Can I trust what they say?
ACCEPTABLE Moderate concerns. Standard privacy hygiene applies.
4 Contradictions
0 Critical
3 High
1 Medium
5 Sources
Findings by concern
Data Sharing 1/4 LOW 1 finding
⚡ high · policy claim vs third-party research
GitHub promised not to train Copilot on your private code. But a class-action lawsuit showed Copilot reproducing verbatim chunks of open-source code — including copyright notices and licence text — without attribution. If the AI memorised and regurgitated GPL-licensed code, it violated the licence. If it did the same with your private code, you'd never know. The training data went from GitHub to OpenAI to Copilot. GitHub controls what goes in. OpenAI controls what comes out. You control nothing in between.

What they claim: GitHub states "we do not train GitHub Copilot on private repository code" in its privacy documentation.

What we found: In November 2022, a class-action lawsuit (Doe v. GitHub) was filed alleging GitHub Copilot reproduced substantial portions of licensed open-source code without attribution, violating open-source licences (GPL, MIT, Apache). Researchers demonstrated Copilot generating verbatim code snippets including copyright notices and licence headers from training data. While GitHub claims private repos are excluded, the boundary between public and private training data has been questioned — especially since GitHub has access to all repository data on its platform, and OpenAI (which trains Copilot's models) received the training dataset from GitHub.

Security 2/4 MODERATE 2 findings
⚡ high · policy claim vs regulatory finding
Iranian developers woke up and couldn't access their own code. Years of work, locked behind US sanctions — on a platform that calls itself global and neutral. In 2024, Palestinian developers had accounts suspended for unspecified "terms of service violations." GitHub reversed it after the internet noticed. Security researchers had their proof-of-concept exploits removed via DMCA takedowns. When your code lives on a US-owned platform, US sanctions decide if you can access it, US politics decide if your account survives, and US copyright law decides if your security research stays up. Neutral is a marketing term.

What they claim: GitHub positions itself as a neutral platform for all developers.

What we found: In 2019, GitHub restricted access for developers in Iran, Syria, Crimea, Cuba, and North Korea citing US trade sanctions. Developers lost access to private repositories containing years of work. In 2024, GitHub suspended accounts of developers contributing to Palestinian open-source projects, citing unspecified "terms of service violations"; the suspensions were later reversed after backlash. GitHub also complies with DMCA takedowns that have been used to remove security research, and in 2021 it removed the ProxyLogon proof-of-concept exploit, citing its acceptable use policies. A code platform under US jurisdiction means US sanctions, US politics, and US copyright law determine who can write software.

⚫ medium · marketing claim vs third-party research
Copilot makes you code faster. Stanford found it also makes you code worse. Developers using AI assistance introduced more security vulnerabilities — SQL injection, cross-site scripting, the classics. Worse: they were more confident the code was secure. Researchers called it the overconfidence effect. You write faster, review less, and ship bugs you wouldn't have written by hand. Copilot optimises for velocity. Security optimises for paranoia. These are different goals.

What they claim: GitHub markets Copilot as a productivity tool that helps developers "code faster."

What we found: Stanford researchers found that code written with AI assistance contained more security vulnerabilities than code written without it. Developers using Copilot were more likely to introduce SQL injection, XSS, and other OWASP Top 10 vulnerabilities because the AI generated plausible-looking but insecure patterns. A separate study found developers using AI assistants were more confident in their code's security while actually producing less secure output — a phenomenon researchers called "the overconfidence effect." Copilot optimises for speed, not safety.
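To make the SQL injection risk concrete, here is a minimal sketch of the kind of plausible-looking but insecure pattern described above, next to a parameterised alternative. It is an illustration only, not output captured from Copilot and not code from the study. The table, the function names, and the payload are hypothetical, and it assumes Python's built-in sqlite3 module.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Insecure: user input is pasted straight into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the "?" placeholder makes sqlite3 bind the value as data,
    # so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"  # hypothetical attacker-supplied input
    print(len(find_user_unsafe(conn, payload)))  # every row comes back
    print(len(find_user_safe(conn, payload)))    # no row matches the literal string
```

Both functions look equally reasonable at a glance, which is the point: the difference only shows up under hostile input, so a fast review can miss it.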

Honesty 2/4 MODERATE 1 finding
⚡ high · policy claim vs network analysis
You write code in VS Code. Copilot sends it to OpenAI for completion. OpenAI runs on Microsoft Azure and now AWS. Microsoft owns GitHub and 27% of OpenAI. Your proprietary code passes through three companies with $185 billion in mutual financial obligations. A developer writing trade secrets has their code context flowing through a code host, an AI company, and a cloud provider — all connected by ownership and investment. They promised the platforms are separate. The money says they aren't.

What they claim: GitHub's privacy statement says it collects data "to provide, improve, and develop" its services.

What we found: GitHub sends telemetry to Microsoft. GitHub Copilot sends code context to OpenAI's API for completion. This means your code passes through: GitHub (Microsoft) servers, OpenAI's inference infrastructure, and potentially AWS (where OpenAI now deploys via Bedrock). A developer writing proprietary code in VS Code with Copilot enabled has their code context flowing through three companies — all part of the Microsoft-OpenAI-AWS convergence. Microsoft owns GitHub and 27% of OpenAI. The code hosting platform, the AI model, and the cloud infrastructure are financially entangled.

What happened to real people
Documented incidents involving Microsoft products and user data.
First PRISM participant (2007). 31% of US legal demands come with secrecy orders: 1,974 gag orders in H1 2025 alone. Users were never told their data was demanded.
Storm-0558: Chinese hackers used a stolen Microsoft signing key to access US government officials' email accounts. Microsoft's own infrastructure was the attack vector.
What your data is worth to governments
Microsoft complied with 6,288 government data requests in H1 2025. Secrecy orders accompany 31% of demands. Microsoft has been a confirmed PRISM participant since 2007. Under this programme, the NSA collects stored communications. The company is legally prohibited from telling you. Jurisdiction: US (CLOUD Act, FISA Section 702, Patriot Act).
Sources