Reflecting on the Leeds #Datadive

Last Friday night, I found myself in a sun-filled loft workshop in Leeds. All the people in the room seemed to be in one corner, but that’s where the (free) bar was. Tables are set out in rows. Solid wood and rubber-topped refugees from the re-fit of Birmingham library. They are already filled with laptops.

This loft space belongs to the ODI’s node in Leeds. The laptops belong to data scientists, but the people are a mix of the data savvy and local and national charities. All here for the first, it’s hoped, of many DataDives.

The event was organised by DatakindUK, a chapter of the US group Datakind who “create teams of pro bono data scientists” to work with organisations to solve problems. Local charities are invited to pitch requests for help. If selected, they provide data which, in the run-up to the event, is cleaned up by data heroes, ready to be pitched at the start of the weekend. Local organisations also pitched in data. Leeds City Council and their DataMill, for example, had offered up data to use.

So, after beer and chat, the three charities pitched their problems.


  • Volition, representing a large network of mental health organisations in Leeds, had a common problem. They had lots of information about the organisations and their work (literally a database of the stuff) but wanted to link it with data about mental health issues in Leeds.
  • Voluntary Action Leeds had stacks of interviews with young people, exploring the issue of being a NEET (not in employment, education or training). They wanted a way to sift the text to look for common themes and also wondered if there was a way of detecting unknown NEETs in existing demographic data.
  • The Young Foundation (who also co-sponsored the event) have recently set up a new project in Leeds gathering information around financial exclusion. The project, part of a broader range of projects Leeds are running, looks at the growth of loans, payday lenders etc. They wanted to surface data around the issue.

The rest of the evening was a kind of slow speed-date where the volunteers in the room pitched themselves and their skills and were wooed by the charities. Eventually they split into teams to get to work on the Saturday and Sunday.

Datakind

Datakind are an interesting organisation and a new one on me. They are clearly very much at the altruistic end of the hackathon/datalab movement. Their founder is Jake Porway, who used to work for the R&D lab at The New York Times (it seems you’re never far away from journalism!). He told Wired that he wanted more from the data boom that was happening around him: “the things that people would do with it seemed so frivolous — they would build apps to help them park their car or find a local bar. I just thought, ‘This is crazy, we need to do something more.'”

That more isn’t just the pro-bono aspect – free data scientists.  The Datakind people in the room are also there to pass on skills to the organisations.

It was great to see the charities getting excited about the possibilities of everything from simple tools like Wordle to more complex text analysis software and maps.
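For the curious, the Wordle end of that spectrum is only a few lines of code. What follows is a minimal sketch of a word-frequency count, not what any of the teams actually ran: the `common_terms` function, the stop-word list and the sample snippets are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real work would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "i", "it",
              "is", "for", "was", "my", "but", "on"}


def common_terms(transcripts, top_n=5):
    """Return the top_n most frequent non-stop-words across all transcripts."""
    counts = Counter()
    for text in transcripts:
        # Lower-case the text and pull out runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented snippets for illustration only; not data from the event.
sample = [
    "I wanted training but the bus to the job centre was too expensive",
    "Training was offered but the bus times never matched",
]

if __name__ == "__main__":
    for word, count in common_terms(sample, top_n=3):
        print(word, count)
```

A word cloud is just this table with the counts turned into font sizes. Finding real themes needs more than counting, but it is a quick first pass over a stack of interviews.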

Sunday afternoon and it was time to show and tell.

The end results were a real mix of the complex – synthetic personality types for identifying the financially excluded – to simple infographics. But there was real impact in the data on the people in the room, perhaps best exemplified by the debate and discussion that was generated by an extra mapping project that sprang up during the weekend.

They simply took the datasets each group were finding/generating and mapped them. Technically, not that much of a challenge (except for a tricky issue with local government boundaries) but the insights were immediate.


Where is the value?

When I spoke to representatives of the charities, there was a general feeling that data was important. They all recognise that the third-sector is fast becoming data-driven. But beyond the process of writing reports or bids, the real value of data was still to be explored and understood. It just feels important.

The complexities of the third-sector ecosystem don’t help when it comes to raising awareness of events like this though. Even when free help and experience are on hand.

When I asked people about how they found themselves at the event, it revealed a complex web of umbrella groups, agencies and initiatives – understanding that would need a datadive in itself! The organisers were similarly challenged; pulling the event together had proved a slower and more complicated process compared to their London datadives.

Good people. Good work.

After the ODI summit last year, I found myself reflecting on the difficult line there is between the power to do good and the power to do business that data provides. After the event I found myself chatting through similar issues with Paul Connell, one of the founders of the ODI Leeds node. He was pragmatic about the challenges: balancing the urge to do good with the urge to create the new Uber. It’s a tension that often makes hack events tricky spaces. So, with my research hat on, it’s tempting to start trying to unpick the motivations of activities like this beyond the desire to give those people involved “the warm fuzzies”, as Datakind put it on their homepage.

But the vibe at the Leeds Datadive event really did make it feel impervious to scepticism.  The results, rough round the edges as they were, felt ‘useful’.

As an example: one of the teams, analysing data around NEETs, looked at sanctions imposed on young jobseekers (the stop in benefits that’s imposed if you don’t toe the line with your employment service). Sanctions vary, but you can get a four-week ‘ban’ for missing an appointment. Mapping the data seemed to make a compelling point – the most sanctions were applied to people who live furthest away from the job centres. That piqued a fair bit of interest from journos in my feed (even on a Sunday morning).

Whether further analysis proves that or, more likely, reveals the finer detail, is moot. In a short space of time, simple but no less surprising truths about the experiences of people in Leeds were revealed.

DatakindUK hope this is going to be the first of many events outside of London and I’d make a point of tracking them down next time.

How can you spot a digital native? Check their little finger.

Apropos of nothing really, I got into an interesting chat with some of the third-year journalism students about how our use of social media would evolve. I wondered aloud about how the physical way we access information might change us.

Writing blisters vs phone rub.


I pointed to my middle finger as an example. I have a writing blister, albeit smaller than it used to be. The result of pressing too hard on my pen through years of school. At its peak it was an ink-stained blob on the end of my finger. Checking with colleagues, they all had the same. Different fingers, but the same rough patch. How likely, I wondered, was it to have a writing blister today?

According to my students, and I asked the same question of the prospective students I spoke to today, not very. But what they do have is a rough patch of skin on the inside edge of their little finger. It’s caused by resting your phone on your finger when using it. Others reported flatter finger ends or calluses on the ends of their fingers and thumbs. But the rough little finger was the most common.

It got me thinking about shibboleths. The ways we can distinguish between natives and those new to a culture and its landscape. It’s been interesting to watch people quietly check their little finger to see whether they carry the mark.

The most important code you can learn: the hyperlink

<a href="http://andydickinson.net" > Andy's site </a>

It’s amazing isn’t it? How something so simple can be so fundamental.

In this world of content management systems, data journalism, JavaScript, Python and all the other coding and technological innovations we are compelled to explore, it’s sobering to sit back and reflect on its simplicity.

With this one piece of code you can link a quote from the Chancellor’s budget speech to the figures from the Office for Budget Responsibility. You can take people back to events from years before to experience them as if for the first time. You can take someone from Preston to Beijing in one click.

It’s a time-travel machine, a Star Trek-like transporter, a silent voice in the background of your writing, ready to pitch in and explain or define.

Links are the currency of the web – without them Google wouldn’t exist. But their ubiquity can also cheapen them: without links, millions of snake oil salesmen would be out of (second) jobs.

In a digital world I think there is something powerful, almost physical, about being able to add a link ‘the old-fashioned way’ – typing it in longhand. It bypasses the uncritical. Subverts the automated. It offers time to reflect. And in that it creates value.

If you understand the value of a hyperlink, you understand the value of a connection.

How are you going to use something so powerful?

Oborne and the fetish for old school journalism

Like many in the media industry I’ve been confronted with a wall of coverage around Peter Oborne’s resignation from the Telegraph. I read his piece when it came out but have sat on my thoughts. That’s mainly because I can trust those better qualified than me to debate the meat of his criticism – the undue influence of advertisers, which others have developed to also include the influence of proprietors.

But the bit that I’ve been chewing on for the last few days was the first part of Oborne’s resignation piece – the bit I’ll call the ‘don’t forget the print edition’ bit.

When I first wrote that par it was the ‘ I don’t like digital bit’, but I realize that’s not fair. He’s pro-paper. Clearly the “country solicitors, struggling small businessmen, harassed second secretaries in foreign embassies, schoolteachers, military folk, farmers—decent people” don’t do digital either. If I apply a microscope to the piece there’s recognition of digital. Oborne isn’t “saying that online traffic is unimportant”.

No. He’s saying that the Telegraph has turned its back on good journalism and digital is part of that.

So I still bristle a little when I read the piece. Not because of the apparent lack of journalistic integrity in the British press – who knew! It’s not even because I might think that the Telegraph’s digital strategy is right or wrong. I bristle because, by design (and I credit Oborne with enough editorial skill that everything is considered in that piece), he conflates digital content with editorial decline and an inherent editorial weakness. Somehow there is a direct line between digital and bowing to pressure from above. Both responsible for the death of ‘quality journalism’.

The fetishising of editorial value.

Like many others who rhetorically define quality journalism at the expense of digital, Oborne takes the freakshow approach and parades a three-breasted lady as evidence of the base nature of digital whilst at the same time deftly stepping off the stage to point out where the extra tit is stuck on. All the while avoiding the fact that he always remains a member of the circus.

This ability to be in journalism but not of parts of it is a common trope – the idea of what you do as quality journalism vs, well, anything we don’t think is quality. It often comes with a generalized view of what constitutes journalistic values. It’s common across the generations – some young journalists covet the halcyon days of old-school journalism as much as some of the older generation love to recall them. These were the days of long lunches and masters of a craft not process. The days of country solicitors, schoolteachers, military folk, farmers – decent people with decent values.

And whilst there may have been a golden age of journalism – at least for those who enjoyed it – some of the reaction to, and in part some of, Oborne’s complaints show just what a fetish that’s become.

Oborne himself invokes one of the most fetishised parts:

It has long been axiomatic in quality British journalism that the advertising department and editorial should be kept rigorously apart

Desired, yes. Axiomatic? Really! Self-evident? Unquestionable? I think you’d have to be quite selective about your journalism to stand by that statement.

Of course this is an issue of degrees. Yes, I do think there’s a difference between taking a holiday company’s junket and not running a story about HSBC. But how long is it going to stand up to scrutiny beyond a single journalist’s own view of their integrity?

So I’m 100% behind anyone that stands up for a point of principle, Oborne included. He’s as mad as hell and isn’t going to take it anymore. And good on him for stepping away when the Telegraph failed to stand up to his values – his values.

I’m less happy to see digital so lazily used to paint a broad stroke picture of bad journalism. In Oborne’s case, especially when the second half of his resignation letter offers a much more compelling and, from a UK press perspective, fundamental example of the problems with journalism.

The bottom line is, more than anything else I dislike about this story, I’ve now got to wade through a mass of people (and I might add an alarming number of them journalists) going “look, a SERIOUS journalist has resigned because of digital. He thinks it’s crap”. To borrow from Oborne’s experiences, as a digital advocate I’ll have to put up with people telling me “You don’t know what you are fucking talking about.” Thanks.

Light the flaming torches and stand back: Are you a good leader of your social media mob?

I’m pulling together my yearly online journalism ethics lecture. It’s the fifth-ish lecture on this module (some previous ones online) and the fast-moving nature of this stuff means I’m really starting from scratch but I always go back to previous ones to see where my thinking was.

A prevailing theme for me has been how the ethical standard is set and who sets it. The online landscape clearly stretches moral and ethical concerns and the question for me has always been about how much of that we take on board, how much we take on the norms of the web, and how much is a more fundamental journalism ethic that we should stand by.

In questioning that in my lectures over the years, what I’ve noticed is that the tone of online journalism has changed. It’s divesting itself of some of the tradition and reveling in the norms of the medium. Ethics is on the move and the volume has gone up.

So this year my general starting point for the lecture is that outrage is the new journalism.

Outrage is nothing new for online journalism (and it’s not a new observation). Take comment sections on news sites. They are great examples of outrage creation – baiting readers with a story you know is going to get comments regardless of the tone of the comments.

Take this example from the Daily Express about the apparent calls to move a grave because of its proximity to a Muslim grave. Skip to the comments and revel in the outrage. By a strict reading of any rendering of a professional journalism ethic, it seems pretty hard to defend.

The argument about who is responsible for the comments on a site is well-worn. Comment systems work within resources and the law.  Despite efforts by some publications to curb offensive behavior, the idea that the publication or the journalist take any responsibility for eliciting these comments in the first place seems moot. Even if they are providing the target, the damage is done by those who pull the trigger – the people who comment.

This form of outrage creation is also now common in social media. A casual tweet or post –  ‘you won’t believe what this person just said’ – and a viral hit and loads of links later most walk away. But not everyone.

Increasingly I’m seeing a different form of outrage creation. It’s not the fire-and-forget of an article and its comments; it’s sustained, crowd-sourced, journalist-as-brand outrage. It’s ‘I’m outraged and I want you to share that outrage’. Literally share it. Retweet, hashtag and join me in confronting the source of my outrage.

We can tell ourselves that this is simply engaging with an audience. This is the power of social media to right wrongs. It may be. But by another name it’s an angry mob. It may be hashtag shaped pitchforks and flaming torch apps but it’s a mob and it’s your influence (often affiliation with a recognisable journalism brand) and audience (a healthy follower count) that they gather round.

In the social media world it’s easy to see follower counts as a gauge of popularity. Like audience figures or circulation counts. It’s easy to forget that they are individuals with the capacity to reach out and touch. Perhaps that’s why it can take journalists by surprise when they turn on you. Still, it can be deceptively easy to distance yourself from the activities of your audience – they aren’t friends, you don’t follow them; a useful degree of separation.

So when someone posts something vile on social media or trolls another user using a link to your work or a hashtag you’ve promoted, it’s easy to fall back on the same rhetoric that’s used for commenting on web sites. You might make the ammunition but you don’t fire the gun.

What’s the difference? Is someone who goes on to troll a target of your outrage any less of a responsibility than a commenter on your website? Remember this is ethics not law.

I would argue that whilst the comments on a website help create and feed a mob (with all the issues that can create for a site) what you post on social media means you create and lead a mob.

Social media mobs have done some great things but ethically, are you doing the right thing by and with yours?

Notes:

 – I know by citing the Daily Express I’m not doing myself any favours. It’s easy to write them, and the commenters, off as some kind of nutjob fringe. Sadly they are journalism. For the sake of this post the visibility and tone served a purpose. I’m sure that journalists from sites with more active moderation (and more generally agreeable politics) would testify to no less offensive and distressing material appearing on their virtual doorstep.

 – I tried really hard not to push the gun/arms metaphor here but, if forced to, I’d have to say that I don’t think journalists on social media are like gun or ammunition manufacturers, even though the logic of distance against blame makes for some very similar ethical positions. For what it’s worth I think a lot of journalistic use of social media is more like the activities of the National Rifle Association.

Data Journalism: Meyerism and Vox’s new data team.

A few weeks back I wrote a post about Data Journalism and how it was defined (on Wikipedia at least).  So I was interested to read Vox’s take on it when they announced, last Wednesday, that they were creating a Vox data team:

Interestingly for me, Vox co-founder Melissa Bell sees their kind of data journalism as a direct descendant of Philip Meyer’s Precision Journalism work on the Detroit Riots (1967).

Forty-eight years after Detroit, precision journalism has given rise to data journalism, which has become a much-touted new media trend.

So Vox’s ‘data journalism’ is 21st Century Precision journalism.

Philip Meyer has become something of an adopted parent to data journalism. The work was not just groundbreaking; more importantly, in my view, it was disruptive. It was disruptive to the status quo of accountability – the assumptions made about the rioters. It was also disruptive to journalism. Meyer’s first iteration of Precision Journalism was directly challenging a prevailing form of literary journalism that many saw as undermining truth and trust. It put science before journalistic belief. In doing that it was also part of a bigger disruption of sociology – a new wave. It’s no surprise then that, like a patron saint, he is invoked by any new data journalism project looking to define the data journalism they do. And Meyer is a very useful starting point.

It doesn’t matter what hue of data journalism you might be, Meyer fits. For many, Meyer is CAR through and through. But if you don’t like the hypothesis-driven, 20th Century trappings of CAR, well, Data Driven journalism has all the same tech but with a nicely positivist, scientific approach. A reading of Meyer that is just as likely to keep those exploring the boundaries of computational and algorithmic journalism happy.

But as much as Meyer offers an agreed (and agreeable) starting point for those looking to unpick the “much-touted new media trend” that is data journalism, for me it’s the fundamental philosophical approach that Meyer disrupts (and suggests in that disruption), that is more useful as a tool to think about data journalism and what it means.

For me, in trying to get a flavor of what’s driving (those involved in) the data journalism conversation, it often comes down to this – which comes first: the data or the question?

A proponent of CAR-informed data journalism would tell you that you start with the question: ‘I know that there are dodgy MPs there, I need the data to tell me how dodgy’. It’s all about sampling. Your DDJ fan would tell you that by analyzing and linking data we would ‘discover’ that there were dodgy MPs. It’s about having all the data.

In a Q&A in the comments (nice idea) Bell gives Vox’s perspective on the ‘which comes first’ question:

It is definitely both. You can start with an idea and seek out data to help answer the questions, or you can start with a data set and surface stories from the changes discovered within that set. Either way, it’s always about being constrained only by your imagination!

So, very much story driven. If we have the data we’ll do something with it.

Editorial Products Engineering Director Ryan Mark steps in to answer a question about the amount of raw data and covers similar ground:

It’s difficult to give a direct answer… it depends on the topic, what data we can get a hold of, and whether that data can help us bring clarity to the thing we’re trying to explain.

Digging for data takes time and doesn’t always yield fruit. Raw data usually comes in drastically different formats and structure and takes work to process and understand.

I think we’ll be collecting as much raw data as we can handle. We’ll have to focus in on the stuff that we think can add the most to our reporting.

Both of those answers speak more to the ‘longitudinal’ issues of data journalism than any definition. How will resources and editorial line impact on the way you use data? How long can you stick to a Data Driven approach when resources and editorial line don’t let you gather and develop databases of raw data? For what it’s worth, I think Bell’s comments about the structure of the team tell us more about where Vox are going, alluding to a more visual, editorially responsive mode.

 

I’m excited to see what Vox come up with. As much as anything else, because what they come up with will excite others – they will be saying ‘we want data journalism like Vox’. As much as Meyer might be the motivation, Vox and their ilk are now the dominant blueprint.

For me, Vox’s position underlines the importance of Meyer as a reference point; common ground on which to start the conversation and not much else. We can’t say that Vox would be any more or less Meyersian in its data journalism.  At best it means I don’t know where you are going but I do know where you are coming from. 

In helping me understand what data journalism is for Vox, that’s as much as I can ask for.

Is more than 100 likes on Facebook news?

Occasionally we get emails asking if we can forward ‘writing opportunities’ to our journalism students.  This week it was for student news site The Tab.

For those looking to take up the opportunity, there’s a handy guide for those asking the question ‘what is news?’

News is what people click on. If your friends are telling each other about something, if people are sharing something on Facebook, or getting angry, or laughing, or shocked, it’s a story.

They even offer a list of examples

If you see one of these things, it’s probably a story:

Now, I have to admit, when I read that my heart sank. I’ll go a stage further and say (as I did on Twitter) that a little bit of my soul died.

Don’t get me wrong, I’m not a snob. For me news is anything that is current and of interest to your target audience. So at one level I have no issue with this kind of statement of a “news” agenda. I do worry about the tone which, given that it’s aimed at university students, sounds more appropriate to explaining a particularly challenging part of Katie Morag to my four-year-old daughter.

But part of me recognizes that maybe I’m uncomfortable about this because it lifts the curtain on journalism’s news agendas. It’s the grubby truth behind the pomp of ‘journalistic values’. The stripped-down basics. News with a catastrophic effects failure (like Bill Bailey’s excellent riff on U2 above).

Journalism is often quite bad at explaining its values system.  Why we do what we do and what we think is news. Charlie Beckett has a nice take on that as he ponders why good news is considered, by some, to be no news at all.

So you could see The Tab’s ‘what is news’ as a positive. It’s a statement of their news values. So I’m not having a go. The Tab are just as entitled to do what they do as anyone else. It’s just that I would have preferred it if they had qualified it with ‘here is what is news for us’.

Why?

I don’t think it serves anyone to call this news. That’s not a value judgement; we are all in the content business.

We are in the business of putting the right content in front of the right people. If I’m going to sell writing for The Tab (or any other publication for that matter) as an opportunity I want to be able to sell it as a way to explore the full extent of what the industry is. That means different styles and definitions of audience. That’s what I teach – journalism as a broad church. Make it easy for me to do that and I’ll help. Leave defining news to the academics (it’s an academic argument anyway) and just tell people what you are publishing, and why.

So does more than 100 likes on Facebook make news? Who cares. If it’s compelling content for your audience, then that’s enough.

Why can’t I tag my followers in Tweetdeck?

I was chatting to @clarecook this morning – office mate and all-round planet brain – and we got onto the subject of how we use Twitter to find people.

Clare noted that often she will follow people based on a conference or event that she saw them at. I recognise that approach. I often put on a glut of ‘following’ before, during and after events as you track the run-up and aftermath.

What Clare also noted was when she wants to find people her point of reference is often that event –  ‘I remember I met a person who was great on business models at #journoconference but can’t remember their name’

If you’re organised then you could have lists but lists don’t seem granular enough to cover the range. Really what you need is a way to tag followers.

We decided that what we want is something like this:

When I follow someone, it would be great if there was a pop-up that allowed me to tag them. I could add my own or select from a number of tags generated from their recent feed. That might help if I want to search for people tweeting around a conference etc. It would also allow me to do that by actual hashtag for the same reasons. Once followed you could search by tag and even identify the tags picked at the time you followed them.
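To make the idea a bit more concrete, here is a rough sketch of the data structure behind that kind of tagging, written in Python. To be clear, this is not the Twitter or TweetDeck API; `FollowBook`, its method names and the example handles are all made up for illustration. It is just two lookups kept in step: tag to handles, and handle to tags.

```python
from collections import defaultdict
from dataclasses import dataclass, field


def _norm(tag):
    """Treat '#JournoConference' and 'journoconference' as the same tag."""
    return tag.lower().lstrip("#")


@dataclass
class FollowBook:
    """Hypothetical store of the tags chosen at the moment of following."""
    by_tag: dict = field(default_factory=lambda: defaultdict(set))
    by_handle: dict = field(default_factory=lambda: defaultdict(set))

    def follow(self, handle, tags):
        """Record a new follow along with the tags picked in the pop-up."""
        for tag in tags:
            self.by_tag[_norm(tag)].add(handle)
            self.by_handle[handle].add(_norm(tag))

    def find(self, *tags):
        """Return the handles that carry every one of the given tags."""
        sets = [self.by_tag.get(_norm(t), set()) for t in tags]
        return sorted(set.intersection(*sets)) if sets else []

    def tags_for(self, handle):
        """Return the tags picked at the time this handle was followed."""
        return sorted(self.by_handle.get(handle, set()))


if __name__ == "__main__":
    book = FollowBook()
    # Invented handles and tags, for illustration only.
    book.follow("@example_a", ["#journoconference", "business models"])
    book.follow("@example_b", ["#journoconference"])
    print(book.find("#journoconference", "business models"))
    print(book.tags_for("@example_a"))
```

Searching on two tags at once is what gets you to ‘the person who was great on business models at #journoconference’ without needing to remember a name.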

In one respect this is the opposite of how many people want to use Twitter. Many tools are about filtering the stream to get current information. But Twitter is meant to be a communication platform – a follower list is as much a list of contacts as it is a list of sources. A convenient way to search your contacts based on context that’s more granular than lists doesn’t seem like too odd an idea to me.

Now, there may be an app that already does this. If so, we’d love to know.

 

When is data journalism not data journalism?

When it’s data driven journalism….

I’m doing lit-review at the moment (this might sound academic but it essentially consists of me yellow-highlighter-penning-the-feck out of papers and journal articles) and I came across a little loop in defining data journalism that got me thinking, thanks to Wikipedia.

Look at Wikipedia’s definition of data journalism and, before you begin, you’re told:

Not to be confused with Data driven journalism

Look at data driven journalism and you’re told:

Not to be confused with  Data journalism

Oh and don’t even think about confusing either of them for database journalism.

Reading the definitions there’s a hint of why. Data driven journalism is one process of the broader practice of  Data journalism. Data journalism reaches outside of journalism to encompass data science and designers.

Does that mean that I can say that if I come from the school of thought that wants to play down (or distance myself from) the idea that data journalism is about output – visualization – that I do data driven journalism? Does the difference speak to philosophical/professional position?

Just get on with it?

In one sense I don’t have a problem with the distinction – it makes a kind of sense. I’m also sure many others won’t, dismissing it with the weary sigh that prefixes ‘what does it matter what we call it, let’s just do it’.

As an observation, I have to say it’s stuff like this that really needs nailing down if data journalism (or whatever you call it) wants to be left alone just to get on with it.

One of the research papers I’ve read (it’s a great paper btw) suggests that “at least part of what is considered as forming the contemporary trend of data journalism mainly operates in the realm of discourse”. In other words, the idea of data journalism is not fixed.

One reading of that is that it’s a developing field and in that there is bound to be an element of evolution (in the Darwinian sense). Look at the Wikipedia page for Computer assisted reporting:

It has been suggested that this article be merged with Data driven journalism. (Discuss) Proposed since October 2011.

You could argue that conceptually (in the minds of those just doing it) this has already happened. The CAR page, like many others on Wikipedia, will serve as much as an archive for the term, reflecting that, at one point, it was considered coherent enough of a thing to warrant its own page. Useful for me as an academic but redundant going forward.

But you could also read it as making it up as we go along – that’s not very precision, is it?