
Help map spending projects around the world!

Anders Pedersen - December 1, 2013 in Contribute, Data Journalism

OpenSpending data is used in projects across the globe by students, journalists, activists, news media, governments, and others. Browse the map to discover some of these exciting projects, and get ideas for your own spending tracking projects.

News Editor Anna Flagg has now created an upgraded map of spending projects from around the world. The map includes investigative projects, budget monitoring groups and financial transparency initiatives. We need your help to expand the map with some of the many new and fantastic projects out there. Here is a quick guide to adding one or more projects in less than five minutes:

1) Go to the Google Doc of spending projects, which functions as the “backend” of the map. Ask for approval to become an editor using the blue “Share” button in the right corner.

2) Add a financial project with information about the country, the URL for the project and other details. If you are unsure how to add the information, get in touch via the OpenSpending mailing list.

3) Share your contribution to the map by tweeting and emailing the mailing list!

Made by Anna Flagg using D3.js.

BudgIT: An App and Platform to track service delivery

Oluseun Onigbinde - November 27, 2013 in community profile, Data Journalism

I swim in an innovation pool of developers, interface designers, programmers and entrepreneurs called Co-Creation Hub. Located in the emerging tech district of Lagos, Co-Creation Hub is the geek’s nest and the innovator’s hub. I am a faithful member of that academy, which picked me up from a bank cubicle to lead BudgIT, the double helix of civic awesomeness.

BudgIT strives hard to make budget access easy for everyone. A few months ago, our website had a messy backend and our mobile platforms were grossly underperforming. The only bright spot was our Twitter page, a restless stream of tweets on public finance and related data. Working with a group of amazing geeks in Co-Creation Hub, BudgIT has been able to fix its platforms, and every state now has its own personalized page, which includes infographics, interactive applications and relevant data.


Our mobile platform experience is different from the web, and we have gone a mile further by releasing apps for Android and Blackberry, the most common platforms within our environment. We have redefined the narrative on our mobile platforms with quick access to the budget, projects and monthly allocations. This is a way of reaching young, urban Nigerians who have access to smartphones and can lead the national discussion on budgets and public data.

Screenshot of BudgIT Android App

Screenshot of BudgIT mobile site

We have decided to go from a budget access platform to a budget tracking platform. This new offering of ours is known as Tracka. Tracka is a social platform of active citizens who are interested in tracking budgets and public projects in their community. Layered on open data and integrated with existing social media tools, this platform will bring people of common interests together to share photos, videos and documents, and to post comments on existing projects. This has the power to extend the use of open data to the larger society, which earnestly yearns for improved government services. There is a strong wish on our end to integrate Tracka with OpenSpending, the visualization platform of the Open Knowledge Foundation.

Tracka: Our Public Projects Platform

Our goal is to amplify citizens’ voices by shining light in corners less understood. Most people are not aware of projects in their neighbourhood, and don’t ask the right questions or connect with the institutions that are supposed to implement the projects. BudgIT wants to lead that conversation based on the fact that the budget is a promise to the citizens. Performance can only be ensured when the citizens and government are on the same page. We want to match the right communication tool to every citizen and ensure that they become active citizens by demanding what is right for them. The next push will be to spark the kind of viral marketing that Nigerian ecommerce platforms put forward in their early days. As we go on the path of democracy, every voice must count.

Who should control the budget?

tarikn - November 5, 2013 in community profile, Data Journalism

Who is really ruling the country? Is it the political party with the most ministerial seats, or the one with the most influential ones? And how do we measure the relative weight of a ministry?

Morocco’s government, as in many countries around the globe, consists of a coalition of political parties. Unlike in the United States, no single party controls the government alone, which leaves room for negotiations and maneuvers to split the ministerial seats. For example, the Istiqlal Party recently began criticizing the draft of the 2014 budget law, despite having authored it earlier, before deciding to leave the government and switch to the opposition.

In political lingo, democracy is to many people synonymous with ‘number of seats’. We are becoming familiar with terms such as ‘majority’ or ‘minority’ based on the number of ‘seats’ in the parliament. Last month, the Moroccan Head of Government, Mr. Benkirane, negotiated the formation of a new coalition government. His party, PJD, kept most of the ministerial seats since it was the majority party. Yet it was disturbing to analyze the budget data distributed by ministers’ political party affiliation: only 8% of the budget spending is directly controlled by the ministers of the majority political party. More than half of the budget is under ministers with no political affiliation (e.g. Ministry of Interior, Ministry of Education, Ministry of Agriculture).
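The kind of breakdown described above can be sketched in a few lines. This is only an illustration of the arithmetic, not the analysis actually used for the post: the field names (`party`, `budget`), the ministry labels and all figures are made up for the example.

```python
from collections import defaultdict


def budget_share_by_party(ministries):
    """Return each party's share of the total budget, as a percentage."""
    totals = defaultdict(float)
    for ministry in ministries:
        totals[ministry["party"]] += ministry["budget"]
    grand_total = sum(totals.values())
    return {party: 100 * amount / grand_total for party, amount in totals.items()}


# Hypothetical figures, for illustration only (not real Moroccan budget data).
sample = [
    {"ministry": "Ministry X", "party": "Majority party", "budget": 10.0},
    {"ministry": "Ministry Y", "party": "Coalition partner", "budget": 30.0},
    {"ministry": "Ministry Z", "party": "No affiliation", "budget": 60.0},
]

shares = budget_share_by_party(sample)
for party, share in shares.items():
    print(f"{party}: {share:.0f}% of the budget")
```

Comparing these shares with each party's number of ministerial seats is what exposes the gap between seats held and budget controlled.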

One might wonder whether the citizen should have any say in who controls the budget. Today, we are asking citizens to engage in democratic practices, vote for the best profiles and programs, and accept the rule of the majority. Though the majority party has most of the ministerial seats, it has no power over the economic agenda. Is this a matter of concern? Are there any international practices for tying the citizens’ vote to budget control?

The budget discussion raises new questions about the meaning of representative democracy. These questions and others are fuelling the distrust of politics in Morocco and the call for more transparency and accountability.

Sevilla Presus: Data-driven journalism at municipal level

J. Félix Ontañón - October 1, 2013 in Data Journalism, Spending Stories

A technology as disruptive as the Internet is forcing us to rethink the role and methods of many professions. Old models haven’t died yet, but in the new ones we can find a common pattern for success: empowering people through a community for cooperation. Journalism isn’t the exception. Through The Guardian Data Blog, for example, many citizens helped to transcribe and find stories in the MPs’ expenses data: people give their eyes, The Guardian gives the platform.

Sevilla Actualidad and Sevilla Report, both local newspapers in Sevilla (Spain), are using two OKFN tools to empower their fellow citizens in the spirit of The Guardian Data Blog. With the help of OpenKratio, a citizens’ group fostering the Open Government Data culture in Spain, we’ve launched #SevillaPresus13, a Crowdcrafting app to crowdsource the transcription of municipal budgets.


We aim to complete a set of visualizations on OpenSpending for the 2011-2013 municipal budget series (2012 is already done), so that they are available for everyone to embed in their own web posts. The plan is to link the budgetary information with municipal public procurement. This way both local newspapers will have a powerful tool to monitor municipal activity and find interesting stories to tell. This project has been accepted into a data-journalism contest in Madrid (Spain).

Crowdcrafting is an amazing platform for building crowdsourcing apps that transcribe documents and images into machine-readable data. As it provides some out-of-the-box PDF transcription apps, all you need to do is download, customize and deploy one for your own purposes. In the case of PDF files, tools such as Tabula are improving the way non-techie people can unlock the information, but only Crowdcrafting is able to provide an engaging crowd experience for users.

OpenSpending News Round-up, August 12

Teodora Beleaga - August 12, 2013 in Data Journalism, Kathmandu, Nigeria, Round-ups, Spending Parties, Spending Stories, Updates

Fiscal transparency never sleeps, and neither does the OpenSpending community. To keep track of all happenings across the open spending spectrum, we’re rounding up the latest blogs, stories and datasets each week. But we’re only human, so if we miss anything, give us a nudge at info [at] openspending [dot] org.

Updates from the community


Last week saw several accounts of the City Spending Data Party we hosted on July 19-21. Prakash Neupane shared the systematic way in which our community in Kathmandu, Nepal approached the city’s expenditure data, in addition to the more detailed account available on the Nepalese OKFN website.

Opening up municipal spending data

Niels Erik Kaaber Rasmussen - July 11, 2013 in Coverage, Data Journalism, Spending Stories

The open data web-agency Buhl & Rasmussen has developed a site visualizing the budgets of all 98 Danish municipalities for one of the biggest Danish news sites, Politiken.

Municipalities are central to the functioning of the Danish welfare state. They take care of a range of important tasks such as social and health care, primary education, social benefits, traffic and much more. However, even in a year with local elections they do not attract much public attention.

One reason for this might be that the barriers for ordinary citizens to engage in local politics are too high. One way to lower the barriers might be to make it easier to understand the most central decision made by municipalities each year: their budgets.

The data

Comprehensive data on Danish municipal budgets and accounts are openly available. Budgets and accounts are structured in a hierarchy with 4 levels and roughly 250 possible expenditure and income posts. Data for all 98 municipalities can be obtained from Statistics Denmark and date back to 1978. Obviously a lot of changes have taken place since 1978, but historical data for the last 5 years or so are reasonably comparable with today’s figures.

Reducing complexity – preparing the data

Multiple problems arise when comparing historical accounts to the latest budget, as I decided to do for this project. First, inflation must be taken into account. Second, the responsibilities of the municipalities are not fixed over time. Third, accounting practices change over time.

I chose to adjust the time series using a combined consumer price and wage index. While this does not fix the problem of changing responsibilities and accounting practices, it improves the overall comparability.
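As a rough sketch of this kind of adjustment: historical amounts are rescaled to the price level of a base year by the ratio of the index values. The post does not say how the price and wage components were weighted, so the 50/50 weighting below is purely an assumption, as are the function names and the index figures.

```python
def combined_index(price_index, wage_index, wage_weight=0.5):
    """Blend a price index and a wage index year by year.

    The 50/50 default weight is an assumption for illustration only.
    """
    return {
        year: (1 - wage_weight) * price_index[year] + wage_weight * wage_index[year]
        for year in price_index
    }


def to_constant_prices(amounts_by_year, index_by_year, base_year):
    """Express every year's amount in the price level of base_year."""
    base = index_by_year[base_year]
    return {
        year: amount * base / index_by_year[year]
        for year, amount in amounts_by_year.items()
    }


# Hypothetical index values and amounts (not Statistics Denmark figures).
prices = {2011: 100.0, 2012: 102.0, 2013: 104.0}
wages = {2011: 100.0, 2012: 104.0, 2013: 106.0}
index = combined_index(prices, wages)

accounts = {2011: 200.0, 2012: 206.0, 2013: 210.0}
adjusted = to_constant_prices(accounts, index, base_year=2013)
```

In this made-up example the nominal increase from 2011 to 2013 disappears entirely once both years are expressed in 2013 prices.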

Dealing with national reimbursements

Some expenses paid for by local government are reimbursed in part by national government. Should the reimbursed part be included as an expense or not? I decided to do both(!). When presenting the budget using the bubble diagram, all figures represent expenditure minus related income, if any. In addition to the bubble diagram, a table view shows the budgets with income and expenditures separated from each other.

In the budgets, expenditure posts are split into operational and construction expenses. The separation makes good sense for a lot of analyses; however, for simplicity I chose not to distinguish between them (the separate parts of an expenditure post are shown on mouseover).

I made an effort to include not just spending data but also figures describing the different sources of income. OpenSpending makes a good case for highlighting public spending but seems to ignore income. To get a reasonable understanding of municipal budgets you have to consider both spending and income.

The solution

The main visualization is the well-known bubble chart from OpenSpending, slightly altered to include an information box when users click on the lowest level. The popup includes information on historical spending and related income, and presents the expenditure as part of the total budget, divided by the number of inhabitants and compared to the national average.

Information popup showing historical data, expenditure as part of the total budget, expenditure per inhabitant and expenditure compared to the national average.

The bubble chart itself is an inspiring tool – thanks to the Open Knowledge Foundation and the OpenSpending-project for providing it.

In addition to the bubble chart, I created some features that are useful when dealing with multiple budgets in the same structured format. Users interested in a specific expenditure post can select that post and see a list of the municipalities spending the most or least per inhabitant (or as a percentage of the total budget).
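A ranking like this is straightforward once every budget shares the same structure. The sketch below is not the site's actual code: the data layout (`name`, `inhabitants`, a `budget` dictionary keyed by expenditure post) and the figures are assumptions chosen for the example.

```python
def rank_by_spending_per_inhabitant(municipalities, post):
    """Rank municipalities by spending on one expenditure post, highest first."""
    rows = []
    for m in municipalities:
        amount = m["budget"].get(post, 0.0)
        total = sum(m["budget"].values())
        rows.append({
            "name": m["name"],
            "per_inhabitant": amount / m["inhabitants"],
            "share_of_budget": 100 * amount / total,
        })
    return sorted(rows, key=lambda row: row["per_inhabitant"], reverse=True)


# Hypothetical municipalities and amounts, for illustration only.
municipalities = [
    {"name": "Municipality A", "inhabitants": 1000,
     "budget": {"schools": 5000.0, "roads": 5000.0}},
    {"name": "Municipality B", "inhabitants": 2000,
     "budget": {"schools": 6000.0, "roads": 2000.0}},
]

ranking = rank_by_spending_per_inhabitant(municipalities, "schools")
for row in ranking:
    print(row["name"], row["per_inhabitant"], f"{row['share_of_budget']:.0f}%")
```

Note that the two measures can disagree: here Municipality A spends more per inhabitant on schools, while Municipality B devotes a larger share of its budget to them.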

The standardized structure of the budgets also made it possible to build a function to compare two different municipalities in detail.

Comparing two municipalities. The small charts show historical data.

The compare feature makes it easy to see how a neighbouring municipality prioritizes differently.

Municipal elections are going to be held in Denmark later this year. Hopefully the opening of the budgets will help lower the barrier for citizens to engage in discussions.

Local spending in your city

While data on budgets and accounts for municipalities in Denmark are relatively easy to obtain, this is far from the case in every municipality or city around the world. In some places the data is not published at all, in many places it is not published in a machine-readable format, and seldom is it made available in a manner that makes it possible to compare one city to another in a meaningful way.

The OpenSpending project is organizing a City Spending Data Party on July 19-21. If you’re interested in local spending data, this is a great chance to get involved.

How Spending Stories Fact Checks Big Brother, the Wiretappers’ Ball

Lucy Chambers - February 24, 2012 in Data Journalism, Spending Stories

This piece was co-written with Eric King of Privacy International and comes as Privacy International launches a huge new data release about companies selling surveillance technologies. It is cross-posted on the MediaShift PBS IDEA LAB.

Today, the global surveillance industry is estimated at around $5 billion a year. But which companies are selling? Which governments are buying? And why should we care?

We show how the OpenSpending platform can be used to speed up fact checking, showing which of these companies have government contracts, and, most interestingly, with which departments…

The Background

Big Brother is now indisputably big business, yet until recently the international trade in surveillance technologies remained largely under the radar of regulators and civil society. Buyers and suppliers meet, mingle and transact at secretive trade conferences around the world, and the details of their dealings are often shielded from public scrutiny by the ubiquitous defence of ‘national security’. Perhaps unsurprisingly, this environment has bred a widespread disregard for ethics and a culture in which the single-minded pursuit of profit is commonplace.

For years, European and American companies have been quietly selling surveillance equipment and software to dictatorships across the Middle East and North Africa – products that have allowed these regimes to maintain a stranglehold over free expression, smother the flames of political dissent and target individuals for arrest, torture and execution.

They include devices that intercept mobile phone calls and text messages in real time on a mass scale, malware and spyware that gives the purchaser complete control over a target’s computer and trojans that allow the camera and microphone on a laptop or mobile phone to be remotely switched on and operated. These technologies are also being bought by Western law enforcement, including small police departments in which the ability of officers to understand the legal parameters, levels of accuracy and limits of acceptability is highly questionable.

The data that has just been released on the Privacy International website includes the following:

  1. An updated list of companies selling surveillance technology, and
  2. The names of all the government agencies attending an international surveillance trade show known as the Wiretappers’ Ball.

Some names are predictable enough: the FBI, the US Drug Enforcement Administration, the UK Serious Organized Crime Agency and Interpol, for example. The presence of others is deeply disturbing: the national security agencies of Bahrain and Yemen, the embassies of Belarus and the Democratic Republic of Congo and the Kenyan intelligence agency, to name but a few. A few are downright baffling, like the US Department of Commerce, the US Fish & Wildlife Service and the Clark County School District Police Department.

Now, with the aid of OpenSpending, anyone can cross reference which contracts these companies hold with governments around the world. The investigation continues…

Using OpenSpending to speed up fact-checking

Privacy International approached the Spending Stories team to ask for a search widget to be able to search across all of the government spending datasets for contracts held between governments and these companies (until this point, it had only been possible to search one database at a time).

The Spending Browser is now live. As the URLs correspond to the queries, individual searches can be passed on for further examination and, importantly, embedded directly in articles. Try it yourself against the list of companies in the Surveillance Section of the Privacy International site (just enter a company name, e.g. ‘Endace Accelerated’, into the search bar).

The Spending Browser will become increasingly powerful as ever more data is loaded into the system.

Want to help make this tool even more powerful? Get involved and help to build up the data bank.


You can read more about the background to these stories on the Privacy International site and in recent coverage by the international media.

Hakuna My Data: NBO Data Bootcamp

Lucy Chambers - January 30, 2012 in Data Journalism, events

This post is by Friedrich Lindenberg, developer on OpenSpending.

“My Name is XXXX, I am a member of the Kenyan parliament for the constituency of XXXX in the 2007-2012 election cycle. During my time in parliament, I have positioned myself against taxes for MPs.

Of the Development Funds allocated to my constituency, I have spent 12mn KSH in 2010 and 8mn KSH in 2009. Since 2007, I’ve funded 201 projects, of which 72 (9mn KSH) related to Education, 56 (7.2mn KSH) related to Health and 20 (4.2mn KSH) to Infrastructure.

The largest projects I have funded include… ”

Auto-generated, spending data-driven campaign speeches like this are just one of the many ideas from the Data Bootcamp that took place in Nairobi last week. Invited by the African Media Initiative and the World Bank Institute, about 70 participants (both journalists and developers) met on Strathmore University’s campus to learn and practise the skills and tools required for data-driven reporting.
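A speech fragment like the one quoted above can be generated by aggregating project records and filling in a sentence template. The sketch below is not the code written at the bootcamp; the record fields (`sector`, `amount_mn_ksh`) and the sample projects are hypothetical.

```python
from collections import defaultdict


def campaign_summary(projects, since_year):
    """Build a one-sentence summary of funded projects, grouped by sector."""
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for project in projects:
        sector = totals[project["sector"]]
        sector["count"] += 1
        sector["amount"] += project["amount_mn_ksh"]

    # List sectors by total amount, largest first.
    ranked = sorted(totals.items(), key=lambda item: item[1]["amount"], reverse=True)
    parts = [
        f"{t['count']} ({t['amount']:.1f}mn KSH) related to {name}"
        for name, t in ranked
    ]
    return (
        f"Since {since_year}, I've funded {len(projects)} projects, of which "
        + ", ".join(parts)
        + "."
    )


# Hypothetical project records, for illustration only.
projects = [
    {"sector": "Education", "amount_mn_ksh": 1.5},
    {"sector": "Education", "amount_mn_ksh": 2.0},
    {"sector": "Health", "amount_mn_ksh": 1.0},
]

speech = campaign_summary(projects, since_year=2007)
print(speech)
```

The same template could be run once per constituency, which is what makes the idea attractive for comparing what candidates say with what the spending data shows.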

The four-day programme combined tools training with practical work in small groups. Elena Egawhary (BBC NewsNight) gave a workshop on data analysis in Excel, and Sreeram Balakrishnan (Google Fusion Tables) introduced both Refine and Fusion Tables. Team members from both the Kenya data portal and the World Bank finance site presented their respective offerings, while Gregor and I from the OpenSpending team gave intros to web scraping and advanced map visualisation.

During group work, journalists and developers teamed up to try their newly learned skills in different domains ranging from sports (football player profiles) to education (missing toilets in schools, “The Shit Ordeal”) and the financial transparency story-telling mentioned above.

The workshop also served as a community-building event for Kenya’s young and impressive Open Data initiative. Future events, aimed at civil society organisations and political actors, will help to further promote the re-use of government information released through the initiative.

All this is happening in a place where transparency is an essential tool to be developed: not only is access to information now guaranteed by the 2010 Kenyan constitution, but there are also major political issues that deserve close attention from local and international watchdogs. These include not only the ongoing incursion of Kenyan troops into Somalia in an effort to fight Al-Shabaab terrorist groups, but also the upcoming nationwide elections in December 2012. The elections will instate a new bicameral system of government, with many previously unknown candidates standing for office. In the previous 2007 vote, bad polling station data had quite literally led to widespread unrest and thousands of deaths across the nation.

In all, it was fantastic to get in touch with the Kenyan participants of the workshop and to see how the organizers of the event (a brilliant team including Craig Hammer, Justin Arenstein and Jay Bhalla) are working to foster an open data community in this bustling developing nation. Given the great ideas generated during the team sessions, I’m sure this work will soon bear its first fruits.

Transparency and technology in Brazil: linking politicians to bad entrepreneurs

Lucy Chambers - January 23, 2012 in Data Journalism, Spending Stories

This story by Fabiano Angélico, who formerly worked at Transparencia Brasil, is about how technology and the help of coders can be used to highlight links between politicians and corrupt entrepreneurs. It is followed by a brief “Behind the News” interview which shows some of the time costs of data wrangling and the problems faced when getting the story out.

How can transparency and technology point out connections between politicians and bad entrepreneurs? Well, first of all you will need some information about the politicians and about the entrepreneurs.

In Brazil, in spite of the historical lack of transparency in governments (Brazil’s freedom of information law was sanctioned only late last year), the Electoral Court has been proactively providing information on political candidates since 2002. One piece of information is the financial donations to the candidates, showing who is donating to whom and how much. Although this database is released only after the elections (the information would surely be more powerful if it were released DURING the political campaigns), one must admit this is a rich source of information.

January 2010. Elections for President and for the Parliament, as well as for State Governors and State Parliaments, would happen in only nine months’ time, in October. However, many people were already discussing them.

At that time, with 2010 just begun, I was at work, thinking of how to find rich and useful information on the candidates. Then I was reminded of the so-called “Dirty List”: a list regularly published by the Ministry of Labour which names the companies and farmers caught by government officials using workers in very poor conditions, similar to slavery.

The list published on the Ministry’s website is in a not-so-friendly PDF format, but it has a plus: it gives not only the name of the company or the entrepreneur/farmer, but also their registry numbers within the government. I remembered that the Electoral Court data also includes these numbers. That was important because having the registry numbers would avoid ambiguities.

I had both lists: the donors to the previous elections (2008, 2006, 2004 and 2002) and the “Dirty” companies. But I had a problem: I did not know how to match up the datasets. My tech knowledge allowed me to transform the PDFs into CSV, but I could not go further without help.

I then sent the datasets, in CSV format, to Transparencia Hacker, a Google Groups list which now gathers over 800 people interested in the connections between transparency and politics/public administration.

Within 2 days, the guys made the datasets talk, and we found that 16 politicians had been elected with the help of “Dirty” money in the 4 previous elections. Another 13 politicians had received donations from the “Dirty List” but had not succeeded in winning the elections.

A local newspaper told the story.

In October 2012, there are local elections in Brazil. I hope we can shed even more light on the candidates.

Behind the news:

Roughly how long did it take you to extract the data from the PDFs? Do you know how long the guys from Transparencia Hacker spent working on the data?

This was kind of easy. It took me just some minutes. The “Dirty List” is a 20-page PDF. I always use a website to convert it into xls or csv (I like Cometdocs for this work).

Here is the Dirty List, in PDF (last updated on the 8th of November, 2011; the list we used is in CSV, but it is very outdated because it dates from January 2010).
Here are the Electoral Court pages for the list of donors: 2002, 2004, 2006, 2008 and 2010.

What I asked the Transparencia Hacker community was to check whether the CNPJs (company registry numbers within the government) in the CSV would match any item on the Electoral Court webpage. The guys worked on the data for 2 days.
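The matching step can be sketched as a simple join on the registry number. This is not the code the Transparencia Hacker volunteers wrote: the column names (`cnpj`, `donor_cnpj`, `candidate`) and the sample rows are invented for the example, and real files would need whatever cleaning their formats require.

```python
import csv
import io
import re


def normalize_cnpj(value):
    """Keep digits only, so differently punctuated numbers compare equal."""
    return re.sub(r"\D", "", value)


def match_donations(dirty_list_file, donations_file):
    """Return donation rows whose donor number appears on the dirty list."""
    dirty = {
        normalize_cnpj(row["cnpj"])
        for row in csv.DictReader(dirty_list_file)
    }
    dirty.discard("")  # ignore rows with a missing registry number
    return [
        row
        for row in csv.DictReader(donations_file)
        if normalize_cnpj(row["donor_cnpj"]) in dirty
    ]


# Made-up sample data; these are not real registry numbers.
dirty_csv = io.StringIO(
    "cnpj,name\n"
    "11.111.111/0001-11,Example Farm\n"
)
donations_csv = io.StringIO(
    "donor_cnpj,candidate,amount\n"
    "11111111000111,Candidate A,1000\n"
    "33333333000133,Candidate B,500\n"
)

matches = match_donations(dirty_csv, donations_csv)
for row in matches:
    print(row["candidate"], row["amount"])
```

Matching on the registry number rather than the company name is what avoids the ambiguities mentioned earlier: names can be spelled in many ways, while the number stays the same once punctuation is stripped.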

Is sufficient data available to visualise the total amount lobbyists donated to political campaigns, and would it be useful to do so? If you were to visualise the info, what would the priorities be to show? Would any tools be useful to explore the data?

Yes, there is enough data. And YES, it would be very useful to visualize those links. I would prioritise the presidential and governor candidates as well as some Congressmen who hold top-positions in both Houses of Congress. Also, the donations to political parties (not to individual politicians) would be a plus.

A search form would be very useful. The search could have filters for position (presidential candidate, governor candidate, political party etc.), geography (Brazil, states) and donors (with no filters, just a blank field for typing).

In your ideal world, in time for the impending elections – what would be done differently from last time? Any additional data you would like to see released?

I’d have to think more carefully to answer that, but concerning additional data: the number which identifies the market (the field) in which the companies work.

Interested in writing a “Behind the News” piece for the OpenSpending blog? Get in touch via our twitter account or email info [at]


How Spending Stories Spots Errors in Public Spending

Lucy Chambers - December 5, 2011 in Data Journalism, Spending Stories

This article was originally published on MediaShift Idea Lab and was co-written by Martin Keegan, project lead for Spending Stories and Lucy Chambers, Community Coordinator for OpenSpending.

How public funds should be spent is often controversial. Information about how that money has already been spent should not be ambiguous at all. People arguing about the future will care about the present, and if data about past or present public spending is available, many will certainly look at it. When they do, occasionally they will find errors, or believe themselves to have found errors.

OpenSpending, which aims to track every (public) government and corporate financial transaction across the world, encourages users to:

  • augment the existing spending database with additional sources of data
  • use that data — e.g., to write evidence-based articles and formulate informed decisions about how their society is financed.

Spending Stories is our effort to make OpenSpending a natural way to do data journalism about public spending.


The Problem

FACT 1: Errors occur in data, no matter how official the source.

FACT 2: Data wrangling (manipulating or restructuring datasets to correct inaccuracies, remix with other datasets to augment the data, or perform calculations on the data), generally improves data quality, for example, through reconciling entities and flagging amounts that are obviously incorrect.

FACT 3: Data wrangling can also introduce errors if not tackled correctly.

Crucial to ensuring the use of this data in articles or ensuring re-use by concerned citizens is the ability to show that the data is valid. In addition, maintaining a good relationship with public bodies who are confident that they are not being misrepresented in the data is vital to ensuring the data continues to be released in the first place. In practice, this means that the provenance of the data has to be clear including:

  • where the data originally came from (preferably a URL)
  • whether anyone (e.g., government, community data wrangler, or OpenSpending) has worked on the data since it was published, and what steps they took to change the data (i.e., these steps should be reproducible to produce the same result)

The OpenSpending team has gone to great lengths to retain enough information to say who was responsible for both of the above.

OpenSpending is a system, somewhat like a wiki, which allows you to track back through the data wrangling process and work out what changes were made to the data, when and by whom.
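One way to picture such a trail is a dataset record that keeps its original source alongside an ordered log of every change. The sketch below is only an illustration of the idea, not OpenSpending's actual data model: the class names, fields, example URL and log entries are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Step:
    """One change made to a dataset after publication."""
    who: str
    when: date
    description: str


@dataclass
class Dataset:
    """A dataset together with its source and the steps applied to it."""
    source_url: str
    steps: list = field(default_factory=list)

    def record(self, who, when, description):
        self.steps.append(Step(who, when, description))

    def history(self):
        """Return the chain of changes, oldest first, as readable lines."""
        return [
            f"{s.when.isoformat()} {s.who}: {s.description}"
            for s in self.steps
        ]


# Hypothetical example; the URL and the entries are placeholders.
dataset = Dataset(source_url="https://example.org/spending.csv")
dataset.record("publishing body", date(2011, 10, 1), "published original file")
dataset.record("community wrangler", date(2011, 10, 5), "reconciled supplier names")

for line in dataset.history():
    print(line)
```

With a log like this, a disputed figure can be traced back step by step to see whether it was present in the original source or introduced by a later change.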

Error reporting in practice

OpenSpending recently received a pointed inquiry from the U.K. Treasury disputing the claims we were making about the payment of British public money to a private company. Believing that an error had been introduced, we attempted to retrace our steps and find out where this had occurred, and who was responsible.

As we discovered, the payment had actually taken place, but the OpenSpending descriptions used to label the transaction were not sufficiently detailed to accurately reflect the item in question.

With Spending Stories, we were able to retrace our steps because we had preserved a copy of the software tools we used for collecting the data (the data is published by about 50 public bodies, and must be downloaded, stitched together, and firmly molded into shape). These tools had also been made available to the public, so the Treasury and other concerned citizens could have checked our work themselves; the availability of this kind of check keeps all participants in the fiscal debate honest.

What had gone wrong was a problem of terminology: The transactions existed, but ambiguous language had been used to describe them, glossing over the distinction between the government department reporting what money had been spent and the government agency which actually spent the money. The bodies in question were the Department of Health and a regional health care trust; this distinction is certainly one which a concerned citizen would expect to be made clearly — so we should make sure our system makes it easy to know which question is being asked.

Checkpoints in OpenSpending

In the short term, we are mitigating the problem of data errors as follows:

  • Data provenance – is the source identifiable and the process reproducible? OpenSpending encourages people to add modified datasets to a “package” in the Data Hub. This allows other users to see the original document alongside any modified documents and track the chain of changes made, to see clearly at which points errors could have been introduced.
  • Crowdsourcing feedback on spending data.
  • Permitting re-use of the structured data we present, so that it can inform decisions in other fact-checking systems.

Ultimately, we will build our part of the ecosystem to provide feedback to the political process, by improving democratic discourse about the public finances.

Lucy Chambers is a community coordinator at the Open Knowledge Foundation. She works on the OKF’s OpenSpending project and coordinates the data-driven-journalism activities of the foundation, including running training sessions and helping to streamline the production of a collaboratively written handbook for data journalists.

Martin Keegan is a software engineer and linguist, currently leading the Open Knowledge Foundation’s OpenSpending project. He is also on the Open Knowledge Foundation’s board, and has worked for SRI, Citrix, University of Cambridge and co-founded and worked for various civil society organizations.