Just a snapshot from my work this week.
With thesis submission day rapidly approaching (next Thursday – finally!), I’m just putting the finishing touches on my network visualisations. Obviously the real thing is higher quality – this is just a screenshot – but what you’re looking at are two things: a) the way in which 135 Perth-based food bloggers network their identities online, and b) the way that Perth food bloggers link to each other via blog rolls. (Tracking the comments on blogs would have been more useful, but I have run out of time to do it.)
I’ve also got overlays that show the links between their Twitter & Facebook pages, as well as how they all fit together.
The text is very small, but basically I use a coded system to designate each individual (for example, if my blog was in there it would be ‘bw’ – beyond words – and then my linked Facebook page would be ‘bw.fb’, and my Twitter profile ‘bw.tw’, and my Urbanspoon profile ‘bw.us’ and so on). I’ve done this for a number of reasons; partly, it was to keep the labels for each node short, so that they didn’t take over the graph, but also to add a degree of anonymity to my results (rather than saying Blogger X links to pages w, y, z), which allowed me a degree more freedom with my study (i.e. not having to get signed permission forms from everyone on the list).
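For anyone curious, the labelling scheme boils down to something like the sketch below. It is a minimal illustration in Python, not the actual process used for the thesis: the function name `node_label` and the `PLATFORM_SUFFIX` table are made up for the example, and only the three suffixes mentioned above (fb, tw, us) are included.

```python
# Suffixes taken from the examples in the post; the table itself is hypothetical.
PLATFORM_SUFFIX = {"facebook": "fb", "twitter": "tw", "urbanspoon": "us"}


def node_label(blog_code, platform=None):
    """Return the short node label for a blog, or for one of its linked profiles."""
    if platform is None:
        return blog_code  # the blog itself, e.g. 'bw'
    return f"{blog_code}.{PLATFORM_SUFFIX[platform]}"


# Labels for one blog and each of its linked profiles.
labels = [node_label("bw")] + [node_label("bw", p) for p in PLATFORM_SUFFIX]
print(labels)
```

Keeping the blog code as a shared prefix means every profile can still be traced back to its blog without a real name appearing on the graph.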
None of my research in any way discusses content on any particular blog; everyone whose work is directly featured in my thesis (such as quotes from blog posts) has granted permission for me to use that information. Everyone else is just a dot, and a link.
Blogs are represented by black dots; aqua represents Facebook; blue is Twitter, etc. I found that bloggers link to sixteen different social platforms (including social networking sites, social bookmarking sites, social recommendation sites, and social curation sites) from their blogs (in order from most popular to least): Urbanspoon, Twitter, Facebook, Instagram, Pinterest, Flickr, Google+, YouTube, Foodgawker, LinkedIn, Posse, Yelp, Last.fm, Tastespotting, Tumblr (blogs that were hosted at Tumblr didn’t count towards this; if they did, Tumblr would come just after Google+, but I counted them as blogs for the sake of consistency), and Vimeo.
I haven’t included platforms like BlogLovin’ on here. I was going to, but in the end was having such problems with it at the time of data collection that I left it out. I’ve also undoubtedly missed some platforms. I used a site called IssueCrawler to get the initial links from blogs. Basically, I uploaded a list of URLs (i.e. all the blog URLs), and the returned results consisted of a spreadsheet with every URL linked from the blog’s main page. I checked everything manually a few times, as the IssueCrawler results weren’t perfect (some blogs appeared to have no links outwards, which proved to be incorrect based upon my double-checking).
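The counting step that follows the crawl could be sketched roughly as below. This is an assumption-heavy illustration rather than what IssueCrawler or my spreadsheet actually does: it assumes the results have been reduced to (blog URL, linked URL) pairs, the `PLATFORM_DOMAINS` table is a hypothetical and incomplete lookup, and the sample rows are invented.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical lookup of domains to platform names; extend as needed.
PLATFORM_DOMAINS = {
    "urbanspoon.com": "Urbanspoon",
    "twitter.com": "Twitter",
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
}


def platform_for(url):
    """Return the platform name for a URL, or None if it is not a known platform."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    for domain, name in PLATFORM_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return name
    return None


def platform_counts(rows):
    """Count how many distinct blogs link to each platform at least once."""
    seen = set()
    for blog, link in rows:
        name = platform_for(link)
        if name:
            seen.add((blog, name))
    return Counter(name for _, name in seen)


# Invented sample rows, standing in for the crawl spreadsheet.
rows = [
    ("http://blog-a.example", "https://twitter.com/blog_a"),
    ("http://blog-a.example", "https://twitter.com/blog_a/status/1"),
    ("http://blog-a.example", "https://www.facebook.com/bloga"),
    ("http://blog-b.example", "http://www.urbanspoon.com/u/profile/1"),
    ("http://blog-b.example", "https://twitter.com/blog_b"),
    ("http://blog-b.example", "http://some-other-site.example/page"),
]
counts = platform_counts(rows)
print(counts.most_common())
```

Counting each blog once per platform, rather than counting every link, is what makes the totals comparable to "how many bloggers have a linked profile". Manual checking is still needed for blogs where the crawl found nothing.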
I also checked everything a few times to make sure that I had the most accurate sample possible. A major challenge came in the form of collecting the sample group. Not all blogs h
One thing that I am hoping to do in the future if I have time is to create a dynamic visualisation of posts, which will plot the occurrence of blog posts across a period of time (back to about 2004, as I think that’s when the earliest post from this group was made) corresponding to the restaurant/cafe/location they blogged about, placed over a map of Perth. I’m not sure how I will treat home-cooking/recipe posts, but these could possibly be plotted as well, although not to a location (maybe by theme or primary ingredient).
Blogs are something of a passé subject these days, but I’m quite fascinated by how the medium has persisted. We have all these other ways to communicate, as the social platform links attest (for instance, 51% of the group has a linked Urbanspoon profile, 49.5% a Twitter account, and 47.5% a Facebook page dedicated to their blog). However, blogs offer the opportunity for longer-form expression that few other platforms allow at this stage. (Newer platforms like Medium are changing that as they straddle the boundary between blog space, collaborative environment, and SNS.)
Because I’m specifically looking at Perth, these links just serve to demonstrate how closely knit the Perth food blogging community is (and, by extension, how close other online communities are). My research focuses on the ways that place identity can be encountered and expressed via locative and location data, so the jewel in the crown of my research is a much bigger map that looks at how Perth bloggers and social media users have talked about Eat, Drink, Perth over the past four years.
The decision to research that particular festival wasn’t entirely arbitrary; there are important, undeniable links between food, geography, identity, and community that are vital to my research, so EDP has been a useful vehicle for exploring local networks in more detail. I’m still working on finalising that visualisation. However, it’s taking a loonnnng time. I’ve collected the geographic coordinates for every EDP event from 2010-2013, as well as (I think?) every online news article, tweet, and blog post about EDP. (Probably not all; there are limitations. For instance, I have no access to private data, such as protected tweets, nor did I want access to them for this project as I am only looking at information that is publicly available. In addition, EDP/Show Me Perth remove content from their website and Facebook page every year before launching the new event. The Wayback Machine was somewhat helpful, but there’s no doubt I’ve missed stuff.)
All that information is being plotted on an incredibly complicated network visualisation that I will share here once it’s done. The graph corresponds to the geodata I’ve collected; for instance, all posts about the Butcher’s Picnic link to the node for that event, which is located (on the map and IRL) in Northbridge Piazza. There are also different levels of links for comments, trackbacks, and different colours utilised to represent different years of the festival. Fun!
A basic example of how the main network visualisation is structured. Bright green: event location (geolocated on a map). Teal: events held at that location during one EDP year (in this case, three; this is just a dummy example, there may be more or fewer in a given year). Purple: blog posts about specific events. Red: bloggers (the actual blogs). If you look closely here, this network depicts ‘blogger a’ as having written both of the blog posts (purple nodes); ‘blogger b’ was linked to in the Mad Hatter’s Tea Party post (there’s a tiny arrow pointing out from that node to the ‘blogger b’ node). Imagine this, hundreds of posts over, for four years…
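The layered structure in that dummy example can be written down as a small typed edge list. The sketch below is only an illustration in Python. The node ids, the relation names (`held_at`, `about`, `wrote`, `links_to`) and most of the edge directions are assumptions; the only direction described above is the arrow from the post to ‘blogger b’.

```python
# Node id -> node type. All ids are placeholders for the dummy example.
nodes = {
    "location": "location",                 # bright green
    "mad hatter's tea party": "event",      # teal
    "event 2": "event",
    "event 3": "event",
    "post 1": "post",                       # purple
    "post 2": "post",
    "blogger a": "blogger",                 # red
    "blogger b": "blogger",
}

# Directed edges as (source, target, relation). Relation names are hypothetical.
edges = [
    ("mad hatter's tea party", "location", "held_at"),
    ("event 2", "location", "held_at"),
    ("event 3", "location", "held_at"),
    ("post 1", "mad hatter's tea party", "about"),
    ("post 2", "event 2", "about"),
    ("blogger a", "post 1", "wrote"),
    ("blogger a", "post 2", "wrote"),
    ("post 1", "blogger b", "links_to"),
]


def posts_at(location):
    """Return the posts written about any event held at the given location."""
    events = {s for s, t, r in edges if r == "held_at" and t == location}
    return sorted(s for s, t, r in edges if r == "about" and t in events)


def authors_at(location):
    """Return the bloggers who wrote at least one post about that location's events."""
    posts = set(posts_at(location))
    return sorted({s for s, t, r in edges if r == "wrote" and t in posts})


print(posts_at("location"))
print(authors_at("location"))
```

Typing the relations separately is what would allow comments, trackbacks and festival years to be styled differently once the edge list is loaded into a graph tool.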
It actually is very fun. It’s just super frustrating and time consuming getting the data to appear in a way that is logical and informative, rather than just being a splash of colour on the screen that isn’t really any good for telling a story.
I’m thinking once I’m done with everything and have a spare moment, I’ll publish a list of Perth food blogs/Facebook pages/Twitter profiles on here, in case anyone is interested. I’ll have high-resolution versions of my visualisations available too.
Here, have a picture I made.
I was trying to get Gephi to make Seadragon-worthy versions of my network visualisations (news flash: it won’t for some reason) and thought I’d play with something less…subjective? My patience for my thesis data is hanging by a thread at the moment so I didn’t want to mess around with it too much.
Instead, I pulled Australia’s population data from Google’s Public Database and mapped it by state… however, I suspect something is up, as it doesn’t seem right that Queensland has all the big cities (I also don’t think it’s right that 1.06 million people live in the Brisbane local government area, but I may be wrong). Ah well. It was nice to look at something other than my regular ol’ network.
I’m annoyed at Flickr and Gephi today.
I used NodeXL on my Windows laptop to pull this data out of Flickr and then used Gephi to visualise it on my Mac. I’ve been doing my own collection (i.e. manual trawling) too, but thought I’d see how NodeXL’s results compared. This is the result of mining the tag ‘Perth’ in Flickr. NodeXL pulled the data on tags relating to Perth (i.e. the other tags that users have given their images when assigning the tag ‘Perth’).
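The idea of "tags related to Perth" can be sketched as a simple co-occurrence count. This is not NodeXL's method, which I haven't inspected. It is a minimal Python illustration that assumes each photo is available as a list of its tags, and the sample photos are invented.

```python
from collections import Counter


def related_tags(photos, seed):
    """Count, per tag, how many photos carry it alongside the seed tag."""
    seed = seed.lower()
    counts = Counter()
    for tags in photos:
        tagset = {t.lower() for t in tags}  # ignore case and repeated tags
        if seed in tagset:
            counts.update(tagset - {seed})
    return counts


# Invented sample data: one list of tags per photo.
photos = [
    ["Perth", "Cottesloe", "beach"],
    ["perth", "bird", "Kings Park"],
    ["Perth", "bird"],
    ["Sydney", "bird"],  # no seed tag, so not counted
]
counts = related_tags(photos, "Perth")
print(counts.most_common(3))
```

Counted this way, a related tag can never score higher than the number of photos carrying the seed tag, which makes for a handy sanity check on any tag network.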
But… as you can see (sort of – terrible pic quality), a lot of the tags are pretty non-specific. The most popular tag relating to Perth is somehow ‘bird’, which makes me think that I’ve actually messed this up, as I don’t see how ‘bird’ can be bigger (i.e. used more often) than ‘Perth’.
I went through and deleted all the tags that I felt were too generic or related to somewhere other than Perth entirely, and this was the result:
Better, but still a bit meh. I guess I’m kind of disappointed by the lack of specificity in the tags that are chosen, and I’d hoped that of the 700 or so tags used by people who also used the Perth tag, more would be … Perthy. The only suburb names that popped up were Cottesloe, South Perth, Crawley, and Fremantle, which I found surprising. But, you can’t change the data. It is what it is. My research so far for this chapter has generally shown that people are lazy with tags and tend to tag ancillary information (username, camera model, misleading keywords like “sexy” and “beautiful” and “girl” – the things people search for – on photos of, say, the beach) more than they use tags as a descriptive device to tell users something about the actual image. Something like 90% of people uploading photos of Perth in 2004 were assigning tags, and they were useful tags. Only 68% are doing so today, and few of them are useful image descriptors.
I’m writing this chapter because I’m interested in the stories that we tell about this city without explicitly using words, and metadata/tags are one of the best means we have of sorting and accessing visual content, yet it’s appearing to be a massively under-utilised affordance of the platform. I suspect that it has something to do with the sheer amount of content that is being created and shared; uploading an image to a photo sharing site such as Flickr is such an everyday act that taking the time to assign meaningful tags seems unnecessary. I know I’m guilty of this as much as the next person, so I don’t know why I was expecting to find something more satisfying in my research, but I suppose that’s an important part of the story that this chapter will tell as well.
Today I’ve been mapping some social networks on Gephi, and really trying to develop my knowledge of the program a bit. I wish that I’d come across this a long time ago because I do feel as though I’m rushing to take everything in at the moment, but it’s going okay for now.
I started by mapping my Facebook network. I used Persuasion’s fantastic guide on mapping Facebook networks, which has definitely been one of the most useful guides I’ve seen to mapping data on Gephi. Unfortunately, for me anyway, the program has been a bit of a mystery and has involved an awful lot of guesswork to figure out how to use it. I’m still trying to get my head around .csv files.
I retrieved the connection data using netvizz – a Facebook app that trawls your network and pulls out all kinds of data. You’ll need to install the app to do the same (just search ‘netvizz’ within Facebook), but it’s fairly straightforward. I imported the data to Gephi as a .gdf file – make note, because this took some fiddling around with for me. I couldn’t get the file to just save as .gdf for some reason, but I eventually got there. (You might not be as new to this as I am so it might not be such a headache!).
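If the .gdf format itself is the mystery, it is just plain text: a node section followed by an edge section. The sketch below writes a tiny one in Python. It reflects my understanding of the basic GDF layout (the `nodedef>` and `edgedef>` header lines) rather than netvizz's exact output, which includes more columns. The names are placeholders, and labels containing commas are not handled.

```python
def to_gdf(nodes, edges):
    """Build GDF text from (name, label) node pairs and (node1, node2) edge pairs."""
    lines = ["nodedef>name VARCHAR,label VARCHAR"]
    lines += [f"{name},{label}" for name, label in nodes]
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR")
    lines += [f"{a},{b}" for a, b in edges]
    return "\n".join(lines) + "\n"


# Placeholder friends and friendships.
nodes = [("n1", "Friend One"), ("n2", "Friend Two"), ("n3", "Friend Three")]
edges = [("n1", "n2"), ("n2", "n3")]

gdf_text = to_gdf(nodes, edges)
print(gdf_text)
```

Saving that text to a file whose name ends in .gdf should be enough for Gephi to offer to open it as a graph.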
Following Persuasion’s guide, this is what I came up with:
I added the labels myself as annotations in Preview. They’re not exact, but they’re pretty close. I’ve set the parameters to exclude anyone that has no network connections with anyone else on my list. There are a couple of people who are (technically) mislabelled here, as they might provide the most significant number of connections to others (for example, the individual that has the largest node is a friend I met in 2007, but through her I have met a lot of people since 2010, so she is in that ‘group’).
Pretty cool hey? I’m hoping to do some more updates as I have more of a play around with the program this afternoon. I’m going to try to retrieve some information about place from the data now, so hopefully there’ll be something interesting to see later on.
I’ve collected in excess of 300 subjects in my list of Perth bloggers, and am up to the letter ‘F’ in plotting them. I’m using Gephi for the visualisation, and despite a rocky start (i.e. me having no idea what I was doing) I’ve now got the hang of it and it’s starting to look pretty damn cool!
Probably the craziest thing is that this list just keeps on growing – I’m probably discovering 20 new blogs a day, at least, but I’m only plotting those that are active bloggers (i.e. have posted within the last year and posted regularly before that, and use another platform – Twitter, Facebook, Instagram, Flickr, etc – as well as blogging). What that means is that there are potentially hundreds more.
Every dot on this graph represents a blogger, and every line is a link in or out of that blog (you might be able to see the tiny arrows pointing the direction). The dots change size as they attract more inward or outward links. The colours are significant too – the pink ones are fashion bloggers, the purple are food bloggers, pale blue are lifestyle bloggers, etc. This is going to change so there’s not too much point going in to it here; it’s just an easy way for me to keep track of what’s going on.
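The sizing rule is essentially a degree count. Gephi works this out for you, so the sketch below is only there to show the arithmetic. It is a minimal Python illustration that assumes the links are held as (source blog, target blog) pairs, and the blog names are invented.

```python
from collections import Counter


def degrees(edges):
    """Return (in_degree, out_degree) counters for a list of directed links."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return in_deg, out_deg


# Invented links: (blog that links out, blog that is linked to).
edges = [
    ("blog a", "blog b"),
    ("blog a", "blog c"),
    ("blog b", "blog c"),
    ("blog d", "blog c"),
]
in_deg, out_deg = degrees(edges)

# Total degree (inward plus outward links) is what drives the dot size here.
total = in_deg + out_deg
print(total.most_common())
```

Keeping inward and outward counts separate is useful later on, since a blog that links out a lot is not the same thing as a blog that many others link to.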
There are labels, too, so I know which dot represents which blogger, but I’ve kept them hidden to protect the identities of the geeky ;)
Including this data in my thesis in visual form is a bit of a gimmick – I could just provide a bunch of stats and numbers – but I feel that it’s really helpful to be able to see what networks look like. Not all blogs are equal, and not all share equal involvement in the blogging community. Of course, this data simply represents the network at this stage; it says nothing about the quality of content (not that I really get to be the judge of this!) or how popular the blogs are (a blog may have few inward links but be read by a significant number of people, and certain genres are more generally popular than others), but it’s a good start. I’ll be doing the same thing with some other networks too, particularly Twitter, as Twitter has stolen a lot of blogging’s thunder in recent years.
Tim Highfield from Curtin has been a massive help with pointing me in the right direction on this one. Check out some of the stuff he’s done with visualisations – his look way cooler than mine.
If you’ve ended up here via Twitter, there’s a fair chance you responded to my tweet for Perth bloggers, so I thought I’d write up a little bit about what I’m doing.
I’m in the final stages of my PhD, and writing up the thesis is… an adventure, to say the least.
Basically, I’ve been observing a whole heap of Perth blogs for the past four years with varying degrees of commitment (often not much), but as I’m nearing the final stages of writing up I really need some hard data to back up my ramblings.
That’s where you come in.
In the initial stages I’m going to compile a list of bloggers (which I will make available here if anyone is interested). Then, I’ll be looking at which other platforms they’re using: Twitter, Facebook, Pinterest, and so on.
I’m going to use a pretty snazzy piece of open-source software called Gephi to plot the connections that exist between the people whose blogs I’ve been following. So I’ll track links between blogs, links between Twitter feeds, links between Facebook pages, etc. until eventually I should have some pretty awesome visualisations of what Perth’s online community looks like.
I will update here over the coming weeks with what is going on, but if you would like to know anything more please feel free to email me or leave a comment here.
A note about how I am using this information
I won’t be doing anything unethical with your information. Nothing will be made public that is not already public online. If you do not use your real name online, your name won’t appear online or in my thesis. I certainly will not be publishing anything like personal contact details.
If you would prefer not to be involved, just contact me and I will take you off the public list (and remove you from my research data altogether).