
How I discovered Open Source

We take Open Source software for granted today. Statistically, there's a good chance that you are reading this article in a web browser based on Chromium, the open source browser that Google markets as Chrome, or one that uses WebKit, connected to this website using security powered by SSL, transported over HTTPS and TCP/IP, using DNS to translate a domain name into an IP address. The page is written in HTML and uses CSS and JavaScript, all open standards.

And, if only a fraction of that made sense to you, that's OK. The point is that we all make use, every single day, of software that has been built and distributed free of charge by armies of volunteer developers. Whilst today many of these developers work on projects funded by large corporations or through charities and organisations funded by donations, the foundations of the internet as we know it today were laid by computer scientists and developers who gave their work away for free for the betterment of us all.

Fortunately for us all, when Sir Tim Berners-Lee created the "World Wide Web", he didn't try to copyright it (although he did sell an NFT of the original source code for several million dollars recently). The web was created to be a set of open standards and, as a consequence, no single person or organisation controls the internet in its entirety today. Lots of individuals, companies and governments own certain bits of it, but nobody controls it all.

I first became aware of Open Source whilst working in the NHS, where I was fortunate enough to be partnered with a software developer who was a strong advocate for Open Source and who opened my eyes to the incredible community that I'd been ignoring in my (to that point) very Microsoft-centric development career.

Back when I was introduced to Open Source, it had a distinctly disruptive, rebellious, counter-culture vibe. Microsoft, the biggest software company in the world back then, hated Open Source.

Linux is a cancer that attaches itself in an intellectual property sense to everything it touches. That's the way that the license works.

  • Steve Ballmer, Microsoft

As I started working with Open Source technology, Microsoft became the enemy. It felt natural as, to them, we were the enemy and some of my experiences with Microsoft account managers/lobbyists whilst I was working with the NHS only reinforced this. It became... a little personal.

But, over time, things changed. Microsoft changed its position and today... Microsoft loves Open Source.

Microsoft hearts Open Source

In the 2010s, as the IT industry turned away from the desktop and towards the cloud, and under the stewardship of new CEO Satya Nadella, Microsoft open-sourced the .NET Framework, made significant investments in and donations to the Linux Foundation and Open Source Initiative, and became a major contributor to the Linux kernel. Azure, Microsoft's exceptionally popular hosting service, runs Linux-based operating systems as "first class citizens". By 2017, it seemed the war was over. We'd won.

Then, in 2018, Microsoft acquired GitHub, the biggest source code host for open source projects in the world, and became one of the site's most active contributors. Whilst the acquisition initially caused a few projects to migrate away from GitHub, by 2019 there were over 10 million new users of GitHub.

Had Microsoft learnt the error of their ways?

No, maybe not.

Embrace, Extend, Extinguish

"Embrace, extend, and extinguish" is a phrase that the U.S. Department of Justice found was used internally by Microsoft to describe its strategy for entering product categories involving widely used standards, extending those standards with proprietary capabilities, and then using those differences in order to strongly disadvantage its competitors.

The price of freedom is eternal vigilance

  • Commonly attributed to Thomas Jefferson

Lately, having used its financial might to gain significant control and influence over the Open Source space, Microsoft seems to have brought this strategy back to life and moved very quickly into the extinguish phase when it comes to Open Source.

Microsoft tries to cancel Open Source software in the Microsoft Store

On June 16th, Microsoft updated the policy for the Microsoft Store to include the following restriction for all software sold:

“to prohibit charging fees in the Store for open-source or other software that is generally available for free”

This presented a major problem to Open Source projects that rely on money generated through the Microsoft Store. With the change to its store policy, Microsoft effectively choked off this income. The outcry was immediate, with Bradley Kuhn of the Software Freedom Conservancy writing an excellent article discussing why this move from Microsoft was problematic.

To be clear on this, Open Source was never meant to be free of charge by default. In fact, the best definition of Open Source being free is that it means free as in freedom, not free as in free beer. You can charge for an open source application, within the bounds set by the license. There's also no limit on the amount you can receive as donations, Patreon subscriptions, Ko-Fi payments, etc. However, it's extremely unusual for the software not to be available for free in some form, and this meant that Microsoft's policy change covered all Open Source projects.

Perversely, the business probably making the most money out of Open Source code right now is Microsoft, and the irony of this was not lost on many commentators.

Thankfully, following the widespread outcry from the software development community, Microsoft reverted their policy just two days later. Vigilance and a loud community voice had paid off. For now.

Enter Copilot

Poisoning Open Source business models isn't the only problematic thing that Microsoft have been doing of late. Microsoft recently unveiled Copilot, an AI service that can automatically generate source code for developers.

What did they use to train this AI? Well, according to the Software Freedom Conservancy, Microsoft trained Copilot on Open Source code from GitHub.

If this is the case, then this is a very clear violation of Open Source licenses that require, in the vast majority of cases, that credit is given to the original software authors if you include or redistribute their code as part of your project. Equally, and in a direct reference back to Steve Ballmer's worrying comparison of Linux to cancer, if you use Open Source code under certain licenses (notably the GNU General Public License) and distribute the result, your product must also be released as Open Source under the same terms. The point is (was) to stop unscrupulous developers taking large amounts of Open Source code and using it to create closed-source proprietary products.

Copilot circumvents this by creating new code based on the old code. Whilst human beings arguably have been doing this since programming began, looking at existing programs as a reference to learn how to program, Copilot does this on an industrial scale and gives no credit to the original authors.

The core questions being asked by SFC (and the wider community) are simple:

  1. What case law, if any, did you rely on in Microsoft & GitHub's public claim, stated by GitHub's (then) CEO, that: “(1) training ML systems on public data is fair use, (2) the output belongs to the operator, just like with a compiler”?
  2. If it is, as you claim, permissible to train the model (and allow users to generate code based on that model) on any code whatsoever and not be bound by any licensing terms, why did you choose to only train Copilot's model on FOSS? For example, why are your Microsoft Windows and Office codebases not in your training set?
  3. Can you provide a list of licenses, including names of copyright holders and/or names of Git repositories, that were in the training set used for Copilot? If not, why are you withholding this information from the community?

How does this "extinguish" Open Source? Very simply, if new proprietary code can be generated using Open Source code as a reference point, then it becomes possible to generate "replacements" for popular Open Source products without violating copyright and without having to commit massive amounts of programming hours to creating the new product.

Like a cheap photocopier, Copilot could eventually turn out reasonable replicas of Open Source projects that are sufficiently functional to compete against the original.

Embrace... Extend... Extinguish

Should you leave GitHub?

Should you leave GitHub? The SFC certainly think so.

Personally, I think the jury is out on the utility of Copilot (more on that in another blog post). In a more literal sense, there will be ongoing legal battles for the foreseeable future between Microsoft and the organisations set up to protect Open Source and Free Software about just what Microsoft can and can't do with your code, whether they own GitHub or not.

There can be no doubt, however, that Open Source code made the Internet as we know it today. Flawed, imperfect, but also free. Imagine a world where the Internet is run by Microsoft, or Google, or Facebook. It's not a pretty picture.

Open Source remains worth fighting for and the war, it seems, is far from over.
