A Russian security group has posted a detailed blog post (translation here) about how they managed to extract the source code to over 3,300 websites. The group found that some of the largest and best known domains on the web, such as apache.org and php.net, amongst others, are vulnerable to an elementary information leak that exposes the structure and source of website files. A web surfer is able to extract this information by requesting the hidden metadata directories that popular version control tool Subversion creates.
The actual ‘exploit’ itself has been well known for a long time. It is the fault of the server administrator or developer, rather than the fault of a particular application, since the working metadata directories in Subversion are only required for working copies of code. What is surprising is just how prevalent the problem is – and who it affects. Finding version control metadata directories is as simple as looking for ‘.svn’ or ‘.cvs’ folders within web paths, for example:
The metadata directories are used for development purposes to keep track of development changes to a set of source code before it is committed back to a central repository (and vice-versa). When code is rolled to a live server from a repository, it is supposed to be done as an export rather than as a local working copy, and hence this problem.
Most web servers are configured by default to disallow access to directories that begin with a period (the traditional prefix for a hidden file or folder in UNIX) – which makes this problem more embarrassing for the affected sites as not only have they mismanaged their version control, but have somehow managed to disable the standard safeguard in webservers meant to prevent hidden files and folders from being returned to users.