February 15, 2010

charles web proxy review

As I said in my last post, I’ve written many a scraper using PHP with curl or fsockopen in my time, trying to write automated tools and scrape data. I’ve tried many tools to help me sniff the HTTP traffic so I could emulate it in PHP as quickly as possible. I started off using Wireshark (or Ethereal, as it was called at the time), which was complete overkill: it’s mostly used for network troubleshooting and grabs all TCP/UDP packets, which is information overload when all we want is HTTP data. Then I think I used the LiveHTTPHeaders addon for Firefox, which was pretty limited. Then a Java program called Burp Suite, which was pretty powerful, but I ran into a problem trying to automate MySpace MyAds submissions when I tried to figure out what the MyAds Flash file was sending over HTTPS. I ran the gamut of proxy tools out there until I came across Charles Web Proxy.

It’s basically the best out there. It sits as a proxy between the web and your browser, grabbing all data as it comes in. This usually causes problems with SSL, but it has a custom SSL cert that you manually add to your browser, which lets you log HTTPS data with no warnings. It can grab Flash traffic, as it seems to work as a system-wide Windows proxy, not just a browser one. It presents HTTP data in many different ways so you can understand what’s going on more quickly. For example, a multipart form upload is presented as the raw HTTP data sent, just the headers, just the cookies, the text body and all the form fields. I won’t list all the features as they’re all listed on the site. If you’re using any other tool for automation/scraping, you’re wasting time.
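
Charles is just as handy for watching your own scripts as it is for watching the browser. Here is a minimal sketch of pointing PHP cURL at it, assuming Charles is listening on 127.0.0.1:8888 (the usual default, but check your own proxy settings). The helper name charles_proxy_options is made up for this example, and switching off certificate checks is only acceptable while sniffing your own traffic.

```php
<?php
// Hypothetical helper: cURL options that send a request through a local
// Charles instance. Host and port are assumptions; adjust to your setup.
function charles_proxy_options(string $host = '127.0.0.1', int $port = 8888): array
{
    return [
        CURLOPT_PROXY          => $host . ':' . $port, // route traffic via Charles
        CURLOPT_RETURNTRANSFER => true,                // return the body as a string
        // Charles re-signs HTTPS with its own cert, so verification is
        // skipped here for debugging only. Never ship this.
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_SSL_VERIFYHOST => 0,
    ];
}

// Usage (commented out so the sketch runs without a proxy or network):
// $ch = curl_init('https://www.example.com/');
// curl_setopt_array($ch, charles_proxy_options());
// $body = curl_exec($ch);
```

With that in place the script’s requests should show up in Charles next to the browser’s, so the two can be compared side by side.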

February 13, 2010

php curl debugging - seeing the exact http request headers sent by curl

In my many years of PHP/curl use, I’ve hammered my head off my table countless times trying to debug scripts that weren’t emulating the browser like they were supposed to. This was pretty hard without seeing the exact HTTP request header sent by cURL each session, but this is possible as of PHP 5.1.3.

Use the curl_getinfo PHP function with the CURLINFO_HEADER_OUT option, but make sure to first set CURLINFO_HEADER_OUT to true with curl_setopt before running the request.

$ch = curl_init("http://www.google.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Ask curl to keep a copy of the request headers it sends
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
$get = curl_exec($ch);
// Fetch the raw request headers as a string
$info = curl_getinfo($ch, CURLINFO_HEADER_OUT);
var_dump($info);
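
The string that comes back is the raw request, so it is easy to break into a name => value array and diff against what the browser sent. A rough sketch: parse_request_headers is a name invented here, not a PHP or cURL function, and it assumes each header name appears only once.

```php
<?php
// Hypothetical helper: split the raw request header string returned by
// curl_getinfo($ch, CURLINFO_HEADER_OUT) into a request line and headers.
function parse_request_headers(string $raw): array
{
    $lines = preg_split('/\r\n|\n/', trim($raw));
    $parsed = ['request_line' => array_shift($lines), 'headers' => []];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not "Name: value"
        }
        [$name, $value] = explode(':', $line, 2);
        // Header names are case-insensitive, so normalise to lower case.
        $parsed['headers'][strtolower(trim($name))] = trim($value);
    }
    return $parsed;
}

// Example with a hand-written request, so no network is needed.
$sample = "GET / HTTP/1.1\r\nHost: www.google.com\r\nAccept: */*\r\n\r\n";
print_r(parse_request_headers($sample));
```

Run the same function over the headers copied out of your sniffing tool and array_diff_assoc the two results to spot what the script is missing.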

December 10, 2009

raygun 4play interview video

I remember seeing this ridiculous interview with a band called Raygun on Graham Linehan’s blog a few months ago. I went looking for it again and noticed Sony had forced most copies to be taken down! But I found this one on YouTube and decided to post it on my site for safekeeping. My favourite bit is ‘they couldn’t even find me a job in a record store’. LOL. Sums up this coddled generation.

August 12, 2009

how to copy a website with httrack on linux

This is more for my own reference than anything. Say you see a flog on the intertubes and want to rip it and stick it up for affiliate links. How do you do it quickly on Linux? I used to use wget but it sucked. httrack is much better.

httrack "http://www.techcrunch.com/" -N1 -O "/home/techcrunch_rip/public_html" +techcrunch.com/* +crunchgear.com/* -v

This will rip the homepage of techcrunch and stick it in the folder specified by -O. The URL filters that follow ensure it only downloads files from certain domains. The -N1 argument is the most important: it ensures httrack sticks all images and CSS in one directory instead of creating loads of directories. Very handy.
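
If you rip sites regularly, the command can be put together from a script. A small sketch in PHP, using only the flags from the command above (-N1, -O, the + filters and -v). build_httrack_command is a made-up helper, and the URL, path and filters are just placeholders.

```php
<?php
// Hypothetical helper: build the httrack command line from the post.
// Only flags that appear in the original command are used.
function build_httrack_command(string $url, string $outputDir, array $filters): string
{
    $parts = ['httrack', escapeshellarg($url), '-N1', '-O', escapeshellarg($outputDir)];
    foreach ($filters as $filter) {
        // Each filter is passed as "+pattern" so httrack allows that domain.
        $parts[] = escapeshellarg('+' . $filter);
    }
    $parts[] = '-v';
    return implode(' ', $parts);
}

echo build_httrack_command(
    'http://www.techcrunch.com/',
    '/home/techcrunch_rip/public_html',
    ['techcrunch.com/*', 'crunchgear.com/*']
), PHP_EOL;
```

Quoting everything with escapeshellarg also stops the shell from expanding the * in the filters. Pass the result to passthru() to actually run it.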

July 1, 2009

oxegen 2009 stage times released

Oxegen 2009 Stage Times

Taken from Oxegen forum. Considering going on the Saturday myself.

June 27, 2009

schalk burger eye gouge on lions luke fitzgerald

Should have been red carded, the scumbag.

June 3, 2009

irish phonebook on your iphone

I was looking for an iPhone app a while ago to search Irish business phone numbers and couldn’t find one, so I wrote one myself. And while I was waiting for Apple to approve my app, a different phonebook app was released with a better user interface! BUT it just searches the Goldenpages website, so you need a net connection. I scraped the Goldenpages website and stuck it in the app, so no net connection is needed, which is handy when you quickly need a number.

Check it out here, only €2.39 to buy.

April 2, 2009

diggbar

Diggbar was just released. It doubles as a URL shortening service, with full, do-follow links. Wonder how long it lasts.

February 4, 2009

php facebook ads api

Sick of waiting for the facebook ads API? Download my PHP one today. Features include:

  • Create ads from a PHP script loop. You can modify any parameters you want and submit 100s of ads an hour, but that will probably get your account banned
  • Pull info from your DB to submit ads, e.g. you could pull artist names and submit loads of ringtone ads using each individual artist
  • I provide a mysql table of all US cities you can target with Facebook, along with the user count for each city. So you can loop through all these and create targeted ads for every city in the US, e.g. ‘Meet Atlanta, GA Women today’ targeted at just the Atlanta, GA demo. This is sure to improve CTR and lower your CPC.
  • Support for proxies

This is the code you use to create ads.

#!/usr/local/bin/php
<?php
// Put your mysql details in the following line
$mysql = new mysqli("localhost","username","password","db");
include("facebook.php");
$facebook = new Facebook();
$facebook->setLogin("yourfacebookemail","yourfbpassword");
$facebook->getHomeCookie();
$facebook->signIn();
$facebook->getAdsHome();
$facebook->locationtype="city";
// The script doesn't support creating campaigns yet so you need to get this from Ads Manager. On the create an ad page, the select box at the bottom where you pick your campaign, just go into the HTML source and find the value for the campaign you want to use.
$facebook->campaignid="campaignid";
// Not sure if these 2 values make a difference. No harm in setting them right.
$facebook->campaignbudget="500.00";
$facebook->dailybudget="500.00";

$query1=$mysql->query("SELECT * FROM us_cities WHERE done='no' ORDER BY usercount DESC");
while ($result1=$query1->fetch_assoc()) {
$insertid=$result1["id"];
$fbid=$result1["facebook_id"];
$city=$result1["city"];
preg_match("/(.+?), .+?/",$city,$justcity);
// this uploads image used in ad.
$facebook->uploadFacebookImage("image.jpg");
// this actually creates and approves the ad. works like this createAd(ad name in ad manager, ad url without the http://, ad title (keep less than 25), ad body (keep less than 135), country, min age, max age, cpc bid, city id, gender targeted, education targeted)
// $justcity[1] has the name of just the city like Atlanta, $city has the state as well eg Atlanta, GA
$facebook->createAd($city,"www.yoururl.com/index.html?id={$insertid}","{$justcity[1]} Free Grants","Our records show 100s of unclaimed grants issued by Pres. Obama available to {$city} residents. You need to claim now.","US","21","25","0.39",$fbid,"male","all");
$query2=$mysql->query("UPDATE us_cities SET done='yes' WHERE id='{$insertid}' LIMIT 1");
// this is the value of seconds waited before submitting each ad. suggest keep it at 15 unless you want your account banned
sleep(15);
}
?>
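
The city handling in the loop is worth trying on its own before submitting anything. A sketch, assuming city names are stored as "City, ST" like the Atlanta, GA example. build_ad_title is a hypothetical helper (it is not part of facebook.php) that reuses the same regex and also enforces the "keep less than 25" rule for ad titles.

```php
<?php
// Hypothetical helper, not part of facebook.php: turn "Atlanta, GA" into an
// ad title like "Atlanta Free Grants", or null if it breaks the length rule.
function build_ad_title(string $cityWithState, string $suffix = 'Free Grants', int $maxLen = 25): ?string
{
    // Same pattern as the loop: capture everything before ", ST".
    if (!preg_match('/(.+?), .+?/', $cityWithState, $justcity)) {
        return null; // not in "City, ST" form
    }
    $title = $justcity[1] . ' ' . $suffix;
    // Titles must stay under $maxLen characters (strlen counts bytes).
    return strlen($title) < $maxLen ? $title : null;
}

var_dump(build_ad_title('Atlanta, GA'));                // short enough
var_dump(build_ad_title('Rancho Santa Margarita, CA')); // too long
```

Rows that come back as null can be skipped or given a shorter suffix instead of being submitted with a title that is too long.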

Here’s a screenshot of the facebook cities table: [Screenshot of phpMyAdmin]

Price is $200. If you want a copy, paypal the cash to filmfind@eircom.net and I’ll email you a copy within a few hours.

January 21, 2009

Global Media Pro - BUYER BEWARE

Do not buy any gear off globalmediapro, they’re dodgy as fuck. You pay them through IBAN bank transfer, which is European, yet goods are shipped from locations all around the world: Singapore, China and Japan. They claim 3 or 4 days delivery but it took us 3 weeks. When the goods arrived by courier, we had to pay nearly double the price in customs to get the stuff, which we thought was coming from the EU.

We emailed and phoned to complain but got no response. They’re dodgy, do not deal with them. Loads of people are having problems with them as well. And word is they threaten to sue people who complain on forums.

And if Globalmediapro is reading this and thinking of suing, you can find my address by whois-ing this domain. Thanks