Facebook Ads Topics

Following the initial data selection, I found Facebook's own choice of advertisement topics for me quite interesting. Based on my browsing and social behaviour, they apparently think these keywords should trigger targeted advertising on their page, relevant for me of course.

From Facebook's archive dataset:

Some samples:

...
James Blake (musician)
Felix da Housecat
Tom Jones (singer)
Thomas Bangalter
Amon Tobin
The Bronx
Yuksek
United Nations Democracy Fund
The A-Team
Two and a Half Men
The Chemical Brothers
Gianni
Berlin
Jamie Woon
Oliver
Pseudonym
Fred (footballer)
Beardyman
Knight
Lafayette, Indiana
Switzerland national football team
DJ Koze
Biblical Magi
Austin, Texas
Reggie Watts
University of St. Gallen
Uffie
Sound recording and reproduction
James (band)
Aeroplane (musician)
Tube & Berger
Ultra-short baseline
Revier
Scarlett Johansson
Fakt
Farmhouse kitchen
X-Press 2
Jamiroquai
Division (military)
Jensen Motors
Shipping container
Grand Theft Auto: San Andreas soundtrack
Home Improvement (TV series)
Frieda (Werra)
Zimoun
...

Google Search

As Facebook thinks these ads are relevant for me, it should be quite interesting to let a search engine automatically query the keywords for me...

For example, if I search for Tom Jones on Google:

The first image result is from Wikipedia:

Doing this over Google's API gives me a text-based result, so I can pull the URL out of it and use it in some kind of script or batch processing:

Lazars-MBP:Ad Script lazar$ curl -s --get --data-urlencode "q=tom jones (singer)" "https://www.googleapis.com/customsearch/v1?num=1&searchType=image&key={key}&cx={cx}"
{
 "kind": "customsearch#search",
 "url": {
  "type": "application/json",
  "template": "https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&cref={cref?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json"
 },
 "queries": {
  "nextPage": [
   {
    "title": "Google Custom Search - tom jones (singer)",
    "totalResults": "98800000",
    "searchTerms": "tom jones (singer)",
    "count": 1,
    "startIndex": 2,
    "inputEncoding": "utf8",
    "outputEncoding": "utf8",
    "safe": "off",
    "cx": "{cx}",
    "searchType": "image"
   }
  ],
  "request": [
   {
    "title": "Google Custom Search - tom jones (singer)",
    "totalResults": "98800000",
    "searchTerms": "tom jones (singer)",
    "count": 1,
    "startIndex": 1,
    "inputEncoding": "utf8",
    "outputEncoding": "utf8",
    "safe": "off",
    "cx": "{cx}",
    "searchType": "image"
   }
  ]
 },
 "context": {
  "title": "website.net"
 },
 "searchInformation": {
  "searchTime": 0.299413,
  "formattedSearchTime": "0.30",
  "totalResults": "98800000",
  "formattedTotalResults": "98,800,000"
 },
 "items": [
  {
   "kind": "customsearch#result",
   "title": "Tom Jones (singer) - Wikipedia, the free encyclopedia",
   "htmlTitle": "\u003cb\u003eTom Jones\u003c/b\u003e (\u003cb\u003esinger\u003c/b\u003e) - Wikipedia, the free encyclopedia",
   "link": "http://upload.wikimedia.org/wikipedia/commons/6/69/Tom_Jones_concert.jpg",
   "displayLink": "en.wikipedia.org",
   "snippet": "Tom Jones concert.jpg",
   "htmlSnippet": "\u003cb\u003eTom Jones\u003c/b\u003e concert.jpg",
   "mime": "image/jpeg",
   "image": {
    "contextLink": "http://en.wikipedia.org/wiki/Tom_Jones_(singer)",
    "height": 3008,
    "width": 2000,
    "byteSize": 893961,
    "thumbnailLink": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcQK5-e4c6NyHJUYrsBZv0cTq6aFwo4OSlTkaFXpHWhBsFQKgOLO6bvVOOE",
    "thumbnailHeight": 150,
    "thumbnailWidth": 100
   }
  }
 ]
}
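
To pull that image URL out of the pretty-printed response without an extra JSON tool, a small sed filter can be enough. This is only a sketch: the function name first_link is made up for illustration, and it assumes the response keeps one key per line as in the output above. A proper JSON parser would be more robust.

```shell
#!/bin/bash
# first_link: hypothetical helper, not part of any Google tooling.
# Reads a pretty-printed Custom Search response on stdin and prints the
# value of the first line whose key is exactly "link"
# (so "contextLink" and "thumbnailLink" are ignored).
first_link() {
  sed -n 's/^[[:space:]]*"link":[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example with a shortened copy of the response shown above.
first_link <<'EOF'
{
 "items": [
  {
   "title": "Tom Jones (singer) - Wikipedia, the free encyclopedia",
   "link": "http://upload.wikimedia.org/wikipedia/commons/6/69/Tom_Jones_concert.jpg",
   "image": {
    "contextLink": "http://en.wikipedia.org/wiki/Tom_Jones_(singer)"
   }
  }
 ]
}
EOF
```

In practice the curl call above would be piped straight into first_link instead of the sample text.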

Bing Search

Unfortunately, Google's API rules have changed, so I can't query it as often as I would need to. Luckily, Bing returns the same first search result:

Their API allows far more queries per day and per month: Microsoft Azure Marketplace Bing Search API

So, by using their API and fetching it over curl, I can get the first result quite conveniently:

Lazars-MBP:Ad Script lazar$ curl --get --data-urlencode "Query='tom jones (singer)'" -u user:password https://api.datamarket.azure.com/Bing/Search/v1/Image?\$format=json\&\$top=1

{"d":{"results":[{"__metadata":{"uri":"https://api.datamarket.azure.com/Data.ashx/Bing/Search/v1/Image?Query=\u0027tom jones (singer)\u0027&$skip=0&$top=1","type":"ImageResult"},"ID":"6ab60591-33a3-4ffd-9ee0-2b0a80320b76","Title":"Tom Jones (singer)","MediaUrl":"http://upload.wikimedia.org/wikipedia/commons/thumb/6/69/Tom_Jones_concert.jpg/220px-Tom_Jones_concert.jpg","SourceUrl":"http://en.wikipedia.org/wiki/Sir_Tom_Jones","DisplayUrl":"en.wikipedia.org/wiki/Sir_Tom_Jones","Width":"220","Height":"331","FileSize":"19473","ContentType":"image/jpeg","Thumbnail":{"__metadata":{"type":"Bing.Thumbnail"},"MediaUrl":"http://ts2.mm.bing.net/th?id=HN.607987620957062653&pid=15.1","ContentType":"image/jpg","Width":"199","Height":"300","FileSize":"9286"}}],"__next":"https://api.datamarket.azure.com/Data.ashx/Bing/Search/v1/Image?Query=\u0027tom%20jones%20(singer)\u0027&$skip=1&$top=1"}}


Filtering through the result, I only need the URL to the specific image:

http://upload.wikimedia.org/wikipedia/commons/thumb/6/69/Tom_Jones_concert.jpg/220px-Tom_Jones_concert.jpg
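
One way to do this filtering is sketched below. It uses grep -o instead of the sed expression in the script further down, purely as an alternative; the function name first_media_url is invented here, and the approach assumes the full-size "MediaUrl" always appears before the thumbnail's, as it does in the response above.

```shell
#!/bin/bash
# first_media_url: hypothetical helper name.
# Reads a Bing image-search JSON response on stdin and prints the first
# "MediaUrl" value. In the response above that is the image itself; the
# thumbnail's MediaUrl appears later in the same result.
first_media_url() {
  grep -o '"MediaUrl":"[^"]*"' | head -n 1 | sed 's/^"MediaUrl":"//; s/"$//'
}

# Shortened sample of the response above.
echo '{"d":{"results":[{"Title":"Tom Jones (singer)","MediaUrl":"http://upload.wikimedia.org/wikipedia/commons/thumb/6/69/Tom_Jones_concert.jpg/220px-Tom_Jones_concert.jpg","Thumbnail":{"MediaUrl":"http://ts2.mm.bing.net/th?id=HN.607987620957062653&pid=15.1"}}]}}' | first_media_url
```

If a query returns no results, the function simply prints nothing.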


Shell scripting

Now a shell script can use an input file with a keyword on every single line to fetch the URLs for every entry in that list:

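#!/bin/bash
# Fetch the first Bing image result for every keyword in a file,
# then download all of those images into ./bing.

# Take the input file from the first argument, or ask for it.
if [ -z "$1" ]
then
echo "ERROR: No input file supplied."
echo "USAGE: ./bing.sh <file>"
echo ""
echo -n "Anyways for now, supply the input file here: "
read -r FILE
else
FILE="$1"
fi

echo ""

# Work inside ./bing and start with empty log files.
# The input file path is expected to be relative to the starting directory.
mkdir -p bing
cd ./bing || exit 1
> bing.log
> urls.log

echo ""
echo "Entries in file:"
wc -l "../$FILE"
echo ""
echo "Fetching first Bing Image Search result for:"
echo ""

# One query per keyword; sed keeps only the first MediaUrl of the response.
while IFS= read -r f; do
echo "$f"
echo "$f" >> bing.log
curl -s --get --data-urlencode "Query='$f'" -u user:password "https://api.datamarket.azure.com/Bing/Search/v1/Image?\$format=json&\$top=1" | sed 's/"MediaUrl":"\([^"]*\).*/\1/;s/.*ImageResult.*",//' >> urls.log
sleep 1
done < "../$FILE"

echo ""
echo "All URLs written."
echo ""
echo "Entries in file:"
wc -l urls.log
echo ""
echo "Now downloading all those images from:"
echo ""

# Download every collected URL, skipping empty lines.
while IFS= read -r u; do
[ -n "$u" ] || continue
echo "$u"
curl -s -O "$u"
sleep 2
done < urls.log

echo ""
echo "All images downloaded."
echo ""

echo "Now back in directory:"
cd -

echo ""
echo "done."

exit 0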


The resulting file urls.log keeps all the URLs for safeguarding purposes; the images themselves have already been downloaded by the script at that point:

http://muslib.ru/pb/0/1828/gus-gus_1757356.jpg
http://www.wma.com/groove%5Farmada/imgs/groove_armada_logo_2.jpg
http://www.freevector.com/site_media/music-portraits/freevector-music-portraits-13-tiga.jpg
http://upload.wikimedia.org/wikipedia/commons/3/3d/Beth_Ditto_IMG_5491.jpg
http://www.namesinlights.co.uk/N/The_Name_Naomi.jpg
http://www.eventim.de/obj/media/DE-eventim/galery/kuenstler/h/helge-schneider-10-2009-003.jpg
http://1.bp.blogspot.com/-i0Gu5VAJUDk/T_GS_1KxAEI/AAAAAAAABAg/8tTsqg-82b0/s1600/f_06-04-Tipsarevic-Janko13.jpg
http://cdn.blogosfere.it/mondoauto/images/teslamodels.jpg
http://3.bp.blogspot.com/-s_qq_Kgc8TA/TWqSaFm-7NI/AAAAAAAAArU/vLa9ak_2YOQ/s1600/idiot.jpg
http://1.bp.blogspot.com/-kMY8b2bO0kU/T4zaRkE0_eI/AAAAAAAAJJU/IDO3sfQiNpk/s1600/bob-sinclar-23.jpg
http://cdn.luxatic.com/wp-content/uploads/2013/02/BMW-3-Series-Gran-Turismo-13-1024x682.jpg
http://www.richmondcenterstage.com/sites/default/files/eventimages/Gabriel-Iglesias-cs_0.jpg
http://www.celebritiesheight.com/wp-content/uploads/2012/02/Dave-Chappelle.jpg
http://deerwaves.com/wp-content/uploads/2012/01/dj-shadow.jpg
http://1.bp.blogspot.com/_8I2h3FR6qRY/SmpkL269W6I/AAAAAAAABAg/02qHLnCZtI4/s400/The-Rapture-band-ga02.jpg
http://todofondosdeseries.com/wp-content/uploads/images/b4/futurama.jpg
http://2.bp.blogspot.com/-xZTsre6r-gw/UC87PKFIqyI/AAAAAAAAEb8/1mCitTB8gCI/s1600/tricorne_simple.jpg
http://www.musik-base.de/images/artists/0000/0242/Sven-Vaeth_huge.jpg
http://images5.fanpop.com/image/photos/31600000/-Isabel-isabel-lucas-31607233-1920-1080.jpg
http://schoensten.janeschindler.ch/cms/sites/default2/files/styles/schoen-img__newmobile/public/_dsc2944_1223.jpg?itok=YGCneSqk
http://images4.fanpop.com/image/photos/17700000/Charlie-Sheen-charlie-sheen-17776967-1024-768.jpg
http://www.frenchbeats.fr/wp-content/uploads/2012/04/462228_324249664295556_146324102088114_759866_738473274_o.jpg
http://images6.fanpop.com/image/photos/32200000/The-King-of-Queens-the-king-of-queens-32207141-1440-1080.jpg
http://farm2.staticflickr.com/1203/868350284_4f25107825.jpg
...
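
Because the download loop saves each image with curl -O, the local file name is taken from the last part of the URL, so two URLs ending in the same name would overwrite each other. A rough check for that is sketched below. The function name duplicate_names is invented for illustration, and cutting off the query string is only an approximation of how curl names the file.

```shell
#!/bin/bash
# duplicate_names: hypothetical helper.
# Reads URLs (one per line) on stdin and prints every file name that
# occurs more than once after cutting off any query string or fragment
# and then everything up to the last slash.
duplicate_names() {
  sed 's/[?#].*$//; s|.*/||' | sort | uniq -d
}

# Example: the first and third URL would both be saved as "a.jpg".
printf '%s\n' \
  'http://example.com/one/a.jpg' \
  'http://example.com/two/b.jpg' \
  'http://example.org/three/a.jpg?size=large' | duplicate_names
```

Against the real list this would be run as duplicate_names < urls.log; no output means no name clashes were found.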


All the results in one montage:

ImageJ

Now I can process all the images with some automated batch jobs. These are based on Lev Manovich's Software Studies Lab in combination with the mighty open-source image manipulation software ImageJ. The process is as follows: first, all images are analyzed for basic properties; afterwards I can use the slice script to arrange various combinations of stripes or slices.

Results

A selection of those slices:

Friends

I also ran the same procedure using image results for my Facebook friends' names instead of ad topics:

One of the ideas at that stage was to put it up as an interactive installation on the wall. A mock-up of it: