
A Russian security group has published a detailed blog post (translation here) about how they managed to extract the source code of over 3,300 websites. The group found that some of the largest and best-known domains on the web, such as apache.org and php.net, are vulnerable to an elementary information leak that exposes the structure and source of website files. A web surfer can extract this information by requesting the hidden metadata directories that the popular version control tool Subversion creates.
The actual ‘exploit’ itself has been well known for a long time. It is the fault of the server administrator or developer, rather than the fault of a particular application, since the working metadata directories in Subversion are only required for working copies of code. What is surprising is just how prevalent the problem is – and who it affects. Finding version control metadata directories is as simple as looking for ‘.svn’ or ‘.cvs’ folders within web paths, for example: http://www.test.com/.svn/.
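As a rough illustration of the check described above, here is a minimal Python sketch that builds the metadata URLs for a site and reports which ones answer with HTTP 200. The function names, the injected `fetch_status` callable, and the `entries` file name (which older Subversion working copies keep inside each `.svn` directory) are assumptions for illustration and do not come from the Russian group's post. It is meant for checking a site you administer yourself.

```python
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed probe paths: older Subversion working copies keep an "entries"
# file inside each .svn directory. Adjust for your own setup.
PROBE_PATHS = (".svn/", ".svn/entries")


def probe_urls(base_url, paths=PROBE_PATHS):
    """Return the metadata URLs to check for a web path such as http://www.test.com/."""
    if not base_url.endswith("/"):
        base_url += "/"
    return [urljoin(base_url, p) for p in paths]


def http_status(url, timeout=5):
    """Fetch a URL and return its HTTP status code, or None if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout and similar.
        return None


def exposed_metadata(base_url, fetch_status=http_status):
    """List the probe URLs that answer 200, i.e. are served to any visitor."""
    return [u for u in probe_urls(base_url) if fetch_status(u) == 200]
```

Calling `exposed_metadata("http://www.test.com/")` would return an empty list for a site that refuses or lacks these paths. A non-empty list suggests a working copy is being served.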
The metadata directories are used during development to keep track of changes to a working copy of the source code before they are committed back to a central repository (and vice versa). When code is rolled out to a live server from a repository, it is supposed to be done as an export, which contains no metadata directories, rather than as a local working copy; deploying a working copy is what causes this problem.
Most web servers are configured by default to disallow access to directories that begin with a period (the traditional prefix for a hidden file or folder in UNIX), which makes this problem more embarrassing for the affected sites: not only have they mismanaged their version control, they have also somehow disabled the standard safeguard in web servers meant to prevent hidden files and folders from being returned to users.





The Google Translate link is set to convert from English to Russian, not Russian to English.
fixed that, thanks.
If your site suffers from this issue, the fix is documented @ http://stackoverflow.com/questions/214886/how-do-i-hide-directories-in-apache-specifically-source-control
That's the wrong fix. The right fix is to learn how to use version control properly (i.e. export over checkout).
+1
Or to use Microsoft version control tools.
Sorry, couldn’t resist!
+1! the fix is to patch the brain, not the software
Not entirely true:
Protecting .svn directories is no harder than protecting other sensitive files.
And using a working copy directly on the web server has advantages, e.g. using the ‘switch’ command is much faster than doing a full export each time.
That’s why so many people do it!
++
Using a revision control system (rather than export) to roll releases provides fine-grained release, rollback, and branch switching that can be extremely painful otherwise with large code bases.
export + rsync
comes down to methodology/preference
The software should work for the humans, not the other way around.
Adding the four lines from the link fixes this problem:
<DirectoryMatch \.svn>
Order allow,deny
Deny from all
</DirectoryMatch>
In academic theory svn export would be “right”, but you lose a lot of pragmatic opportunities to make things work easily in the real world.
I run a tiny but profitable operation with frequent updates to the code base and live site. We could hire a “build engineer” to manage this stuff for $80K a year, or we could add four lines to the conf file. Which one is “the right fix”?
1) There is nothing wrong with using a checkout in production. As others have pointed out, it makes rolling revisions much easier.
2) Most web servers are not configured by default to block access to hidden dirs.
3) Anyone who knows anything has known about this for years.
4) The fix is stupid simple.
This is a very well-known issue, and it shocks me that major website administrators would not have protected against this. This was one of the first things I discovered when starting to admin my own servers.
Even something as simple as URL-Rewriting can eliminate this problem, but ideally any major website has a deployment process that should be using rsync and/or svn export, rather than svn checkout or svn up to update the code across multiple servers.
An exception might be development servers, but those servers should be actively secured behind firewalls.
Actually, the default of Apache is not to hide files that begin with a ‘.’. Rather, a rule is put in place to hide files that start with ‘.ht’ (such as .htaccess files).
I can’t think of a single server app that will automatically hide all ‘.’ prefixed files by default. Apache, IIS and lighttpd will all show ‘.’ files by default.
Hahaha, oops. I’m glad I’m not one of those site admins, because that would be incredibly embarrassing. Especially since, as noted, by default Apache’s configuration says NO to requests for dot directories.
OK, someone has accessed the source code of open source projects? Where is the news?
No. They accessed the source code of the website of an open source project… in the examples listed. Those websites may be fully copyrighted as well, and the source code may reveal sensitive information on how to get inside the administrative areas of the site, potentially opening up a gateway for a malicious person to… you get my tin-foil-hat point.
There are plenty of people and organizations that use open source SCM internally with closed source or proprietary software and website applications, and those may be just as vulnerable to this mis-configuration.
Also, the default rule for Apache (at least apache22 on my FreeBSD 7.0 box) is:
<FilesMatch "^\.ht">
Order allow,deny
Deny from all
Satisfy All
</FilesMatch>
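To see why that default rule does not cover version control metadata, here is a small Python sketch that applies the same pattern to a few names. Apache uses its own regex engine rather than Python's `re`, but for a pattern this simple the two should agree. The function name and the sample names are only illustrative.

```python
import re

# The pattern from the default FilesMatch rule quoted above.
HT_RULE = re.compile(r"^\.ht")


def blocked_by_default_rule(name):
    """True if the name starts with '.ht' and so would be denied by that rule."""
    return HT_RULE.search(name) is not None


# Print the verdict for a few sample names.
for name in (".htaccess", ".htpasswd", ".svn", "entries", ".hg"):
    print(name, blocked_by_default_rule(name))
```

Only the `.ht` names match. Neither `.svn` nor the `entries` file inside it matches, so the rule leaves them readable.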
Or maybe they just assume they’re going to get hacked and are careful to put all sensitive information outside document root. Surely it’s only embarrassing if sensitive information was revealed about apache.org, php.net, etc? Maybe they don’t care either way? You don’t have to be paranoid when you’re an open-source project with nothing to hide.
Man these Russkies could have coded their way to winning the cold war. Good thing they picked working on nukes.
http://www.techcrunch.com was listed! J/k =)
This sounds to me more like a problem in the deploy script than in the use of SVN.
I don’t think they used SVN checkout to put stuff on production servers!
More likely, they use SVN locally and then delegate the deploy process to a tool like rsync, but with the wrong arguments, so the hidden .svn directories are included in the copy as well.
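A deploy step along those lines can leave the metadata out explicitly. The sketch below uses Python's `shutil.copytree` with an ignore list as a stand-in for rsync's exclude options. The directory names in `VCS_DIRS` and the function name are assumptions for illustration, not anyone's actual deploy script.

```python
import shutil

# Metadata directory names to leave behind (assumed list; extend as needed).
VCS_DIRS = (".svn", "CVS", ".git", ".hg")


def deploy_copy(working_copy, target):
    """Copy a working copy into a new target directory, skipping VCS metadata.

    The target must not exist yet; copytree creates it.
    """
    return shutil.copytree(
        working_copy,
        target,
        ignore=shutil.ignore_patterns(*VCS_DIRS),
    )
```

The ignore list applies at every level of the tree, so nested `.svn` directories are skipped too. Any ordinary file that happens to share one of those names is skipped as well.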
Apache.org & php.net might not be the best examples since they are open source projects. I guess they really don’t care that anyone can see the source code of their website, albeit not on purpose.
On the php.net website for example, there is a link called ‘show source code’ on the bottom of every page…
You might get the impression that this leak reveals all the source code for a website. Correct me if I’m wrong, but it’s my impression that you can only get the source code from scripts which are in the public directory of the web-server.
Properly set up websites have most of their code not in the public directory. Especially the sensitive parts.
By “not in the public directory” I presume that you mean outside of the default DocumentRoot setup for the server, so even if you have a .svn directory lying around it wouldn’t reveal the code for any business objects.
In many cases things like config.php files are ALWAYS stored somewhere within the DocumentRoot, and not in some libs directory, and they might contain hard-coded values with details on how to access your databases, etc.
And furthermore, storing things in different directories makes maintaining your SVN repository much more difficult. While a valid approach is to create and maintain different “packages” (conf, static, lib, etc), the more common usage is to deploy the entire repository as your wwwroot, so you don’t have to manage multiple deployments.
It all comes down to laziness. Lazy administrators use default settings which are almost always unprotected (used so often for open-source development), or don’t bother to implement security policies for public systems separate from their development systems.
So what you are saying is that it is probably acceptable to be lazy and not maintain a config package but maintain the utmost security on your DocumentRoot (your third para) but it’s unacceptable to maintain a config package and not really care about people accessing your DocumentRoot (in cases like php.net where you’re making the PHP code available anyway)? Of the two, the second option actually sounds better to me because the first assumes that your security will be impenetrable and the second assumes it’s not.
Ouch. Embarrassing.
uh…. svn export?
Export gigs of stuff 10 times a day to a hundred servers?
No, THANKS
ouch!!!
Some of the blame is on languages like PHP that mix content with script. A Python-based site would not suffer from this.
One word: the forced indentation of code. There’s easier and more secure languages to code in than Python.
You have no idea what you are talking about. LOL.
true, true
This has nothing to do with mixing content and script. This is poor process, and it plagues every language.
Dave,
What? Source code is source code. Python scripts would be just as accessible as php scripts. Makes no difference.
Wow that’s amazing
’scuse me? Any fool who doesn’t put their repositories off of doc-root deserves to be hacked. This story is like saying ‘don’t put a big file called sourcecode.txt in your webroot directory.’
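For anyone wanting to check their own webroot for this, a minimal Python sketch follows. It walks a directory tree and lists any version control metadata directories it finds. The names in `METADATA_DIRS`, the function name, and the example path are assumptions for illustration.

```python
import os

# Directory names treated as version control metadata (assumed list).
METADATA_DIRS = {".svn", "CVS", ".git", ".hg"}


def find_metadata_dirs(docroot):
    """Return sorted paths of VCS metadata directories found under docroot."""
    found = []
    for current, dirnames, _files in os.walk(docroot):
        # Iterate over a copy, because matches are removed from dirnames.
        for name in list(dirnames):
            if name in METADATA_DIRS:
                found.append(os.path.join(current, name))
                dirnames.remove(name)  # do not descend into the metadata itself
    return sorted(found)
```

Something like `find_metadata_dirs("/var/www/html")` (a hypothetical path) returning an empty list means no working-copy metadata sits under that docroot.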
Guess I’m glad I use .NET for all my websites. IIS prevents this stuff by default.
I wrote about this back in January: http://www.adamgotterer.com/2009/01/26/hacking-the-svn-directory/
Looks like your link for apache.org needs a http://
“Most web servers are configured by default to disallow access to directories that begin with a period”
Which ones? Apache and nginx do NOT do that by default.
Besides, the CVS directory is called CVS/, not .cvs/
So that is why people suggest not placing your code in the root directory of your website and using rewrite rules to dispatch the content instead. That way you are not exposing the details of the website. Or, use git, as it is easier to patch by just deleting the .git (without ‘s’) folder on the deployment server.
Note that for php.net, this is not an issue. We have always had a “show source” button on every one of our pages, and our svn tree is up for public display. The few mirrors that are misconfigured this way are not revealing anything that isn’t already extremely public. There are no passwords or anything sensitive to be found in there.
The Techcrunch article is as embarrassing as the affected “Sys Admins”.
Protection of directories which start with a dot by default? Where? You are confusing it with .htaccess, as someone pointed out earlier. And ‘.cvs’ may exist somewhere on your disk, but it’s certainly not part of CVS. Do your homework before you make fun of big websites.
Secondly, the suggested option (somebody’s comment) to use ‘svn export’ may work for smaller projects. But for deployment of a big codebase on multiple servers, this is not an option. At least not when you need the option of rolling back easily and being able to tell, for any random file, which version it is and which changes it has gone through.
Anyway, that’s not a problem, because the “security fix” (I would call it basics of apache administration) works brilliantly. Matter of taste how you populate your production docroots, as long as you know what you’re doing, and what the consequences are.
@Techcrunch: just in case somebody discovers a “massive flaw” in Mercurial, too: ‘.hg’ isn’t protected by default either.
The only security issue about this alleged SVN vulnerability (as people call it on Twitter) is the systems administrator, who didn’t do their job properly. That’s basics, so he probably doesn’t even deserve the title.
Known issue for years. ush.it
This is terrible. Nikto has detected this for a long time. Why are we still fixing issues that should have been fixed a long time ago?
g0l4Na I want to say – thank you for this!
Russian guys have already downloaded thousands of .gov and .com sites.