mrbellek/twitterbot

Stars: 143

Forks: 43

Pull Requests: 2

Issues: 23

Watchers: 16

Last Updated: 2020-10-21 09:15:15

#twitter #twitter-api #twitter-oauth

Several PHP scripts for making Twitter bots that retweet certain terms, or post from a data source (rss, database, markov body, picture folder).

License: Other

Languages: PHP, CSS, JavaScript, HTML, Shell

Readme

Original readme:

This is a small php script which can be used to create twitter bots.
I've used two files (twitteroauth.php and OAuth.php) of Abraham's
twitteroauth project to create this app. It works on OAuth, so
no need to enter your twitter password.

Instruction to use this script can be found here:
http://www.prasannasp.net/how-to-create-a-twitter-bot/

(Note: this repo is being refactored to be more modular and object oriented. New scripts are in the /oo folder)

Introduction

This repository has several different types of Twitter bots with live examples:

retweetbot.php - a script that searches Twitter for certain terms and retweets them if they meet certain conditions - example: @FoundDildo
tweetbot.php - a script that fetches a single record from a database and tweets it, after formatting - example: @NotesFromBTC
rssbot.php - a script that fetches a RSS feed and tweets items from it, after formatting - example: @r_Buttcoin
markovbot.php - a script that parses a body of text and generates tweets from it based on Markov Chains - example: @stallman_txt
favbot.php - a script that searches Twitter for certain terms and favorites them if they meet certain conditions (like retweetbot, but with favs)
picturebot.php - a script that indexes a given folder with pictures and tweets them out one at a time

All scripts assume a basic knowledge of PHP. Every of these bots is used by creating its object and passing arguments in the constructor. None of the scripts should need to be altered to get your bot working.

Setting up a bot

Each of these bots is constructed differently, but there are common steps. They are:

Create a Twitter account and app for the bot

Sign up for Twitter with a new account that will be used to post tweets from. Then, go to apps.twitter.com and create a new app. Enter whatever you want for Name/Description/Website and create the app. Now go to Permissions and change it to Read and Write. Finally, go to the Keys and Access Tokens tab and copy your Consumer Key and Secret. Click the button to generate Access Tokens and copy those too.

Create a PHP script

Next, start a blank PHP file and define constants for CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET. Also, create a MYPATH constant with the path the script will run from (especially important on shared hosting).

The final step differs for each type of bot, but the basic idea is to require() the bot script of your choice, and create a new object (TweetBot, RetweetBot, RssBot, MarkovBot), passing it any arguments it needs. Call run() on the object to make it do its thing. Call your script periodically from a cronjob, between 1 and 4 times per hour.

See any of the scripts in the examples/ folder for details on the bot parameters. (Obviously, the file with the secrets and tokens is not included here.)

Retweet bot

The retweet bot will search Twitter for given terms, and retweet anything that meets the set conditions.

Features:

native retweets instead of 'RT @username ...' posts
will not retweet users on 'blocked users' list
will not retweet tweets matching given keywords
will not retweet usernames matching given keywords
keep track of API rate limit status and not exceed it
multiple possible search queries per bot

Required arguments:

sUsername - the username of the Twitter account the bot will be posting from
aSearchStrings - an array of strings, each is a single search query. You can use boolean logic here to exclude unwanted things early

Optional arguments:

iSearchMax - maximum number of tweets to fetch during each search query (default 5)
iMinRateLimit - the threshold that will stop the bot from tweeting if the rate limit falls below this, and wait until the rate limit reset (default 5)
sSettingsFile - filename of the .json file with settings: keyword filters, username filters, probability values (default blank)
sLastSearchFile - filename of the .json file that stores what tweet was last parsed, to continue from in the next run (default <username>-last<num>.json)
sLogFile - filename of the .log file holding and logger messages that are generated (default <username>.log)

The search strings array allows for multiple search queries for a single bot, all of which will be executed during a run. For instance, if you want your bot to retweet anything mentioning apples OR oranges, you could pass it array(1 => 'apples', 2 => 'oranges').

The settings .json file holds several filters:

Search term filters, listing everything you don't want retweeted even if it satisfies the search query. For instance, if you want your bot to retweet any mention of apples, but not Granny Smith apples, your settings file would contain { "filters": [ "granny smith" ] }.
Username filters, listing whole or parts of usernames you never want to retweet. For instance, to never retweet Applebees, your settings file would contain { "userfilters": [ "applebees" ] }.
Dice values, listing the probability (0-1) of retweeting tweets with images, urls, mentions or none of these. For instance, if you want to always retweet tweets with pictures but never tweets that are mentions, your settings file would contain { "dice": { "media": 1.0, "mentions": 0.0 }}. The default values for the dice are 1/0.8/0.5/0.8 for tweets with pictures/urls/mentions/none, respectively.

Tweet bot

The tweet bot will fetch a semi-random record from a database, format it, and tweet it.

Features:

will keep track of how often a database record has been posted, and always pick the record that was posted least often. If multiple records are found, a random one is picked.
formatting the database record is quite powerful and flexible

Required arguments:

sUsername - the username of the Twitter account the bot will be posting from
aDbVars - an array specifying the table name and column names in the database
sTweetFormat - the format of the tweet to be posted, with placeholders for the data fetched from the database
aTweetVars - an array specifying which placeholder in the tweet format will be filled by which database variable

The aDbVars array consists of the keys sTable, sIdCol, sCounterCol and sTimestampCol indicating what name the database table and database columns have for record id, post count for that record, and timestamp of when that record was last posted. For instance, if your table is named APPLETYPES with columns ID, NAME, GROWN, HITS and WHEN, your array would be array('sTable' => 'APPLETYPES', 'sIdCol' => 'ID', 'sCounterCol' => 'HITS', 'sTimestampCol' => 'WHEN'). The column with the actual content that will be tweeted is not set here.

The sTweetFormat string contains the generic tweet format with colon placeholders, e.g. if your bot tweets the different types of apples that exist, the tweet format could look like The best type of apple is <a href="http://en.wikipedia.org/wiki/:apple">:apple</a>, first grown in :grown!.

Finally, the aTweetVars array ties the placeholders in the tweet format to the colum names from your database. In the above examples we've established, you'd want the value of the NAME column to fill the :apple placeholder, and the GROWN column value in the :grown placeholder. In this case, the array would look like array( array('sVar' => ':apple', 'sRecordField' => 'NAME'), array('sVar' => ':grown', 'sRecordField' => 'GROWN')).

In the case that the tweet would exceed the 140 character limit, you can also specify which placeholder should be truncated to make it fit with the bTruncate setting in aTweetVars (only 1 field can be truncated). For instance, if the first grown date of the apple in our example bot would be too long, you could change its variable to array('sVar' => ':grown', 'sRecordField' => 'GROWN', 'bTruncate' => TRUE). If no field is specified to be truncated and the tweet is too long, or the tweet is still too long after truncating, the tweet is not posted. Note that URLs of any length take only 22 (or 23 for https URLs) characters in a tweet.

RSS bot

The RSS bot will grab an RSS feed, and post any new items in it according to a preset format.

Features:

will keep track of newest item posted, to avoid posting items twice
powerful and flexible formatting of RSS item into tweet by way of regular expressions
optionally attach image links to tweet as embedded media

Required arguments:

sUsername - the username of the Twitter account the bot will be posting from
sUrl - the URL of the RSS feed
sTweetFormat - the format of the tweet to be posted, with placeholders for the data fetched from the RSS feed
aTweetVars - an array specifying which placeholder in the tweet format will be filled by which RSS variable

Optional arguments:

sLastRunFile - filename of the .json file that stores what item was last parsed, to continue from in the next run (default <username>-last.json)
sTimestampXml - name of the XML field in the RSS feed that specifying when the item was published (default pubDate)

The sTweetFormat string contains the generic tweet format with colon placeholders, e.g. if your bot tweets the title, link and timestamp from every item in an RSS feed, the format could look like :title :link (published on :date). These names do not have to be the same as the XML fields in the RSS feed.

The aTweetVars array ties the placeholders in the tweet format to the XML fields in the RSS feed. Regular expressions can be used to grab specific parts of XML fields and tie them to placeholders, or entire contents of an XML field can be used. If a tweet turns out too long, a single variable can be marked for truncating if need be. For instance, if the RSS feed lists <item> nodes that contain <title>, <pubDate> and <source> nodes, your array could look like array( array( 'sVar' => ':title', 'sValue' => 'title'), array('sVar' => ':link', 'sValue' => 'source'), array('sVar' => ':date', 'sValue' => 'pubDate')). (Nested nodes are not supported yet.)

However, if the <source> node, for whatever reason, instead of just the link has the text <a href="http://....'>click here!</a>, you would want just the link and not the whole node text. To accomplish this, the variable would change to e.g. array('sVar' => ':link', 'sValue' => 'source', **'sRegex' => '/href="(.+?)">/'**).

A special value can be passed to the sValue key, to further process another variable into a result that fills the placeholder. Right now, the only value is special:redditmediatype, which determines if the value of a variable with a link is an image, a video, a self post or a cross-link. This only applies to Reddit RSS feeds. For example, to add another variable in our example that would fill a placeholder with the link type: array('sVar' => ':type', 'sValue' => 'special:redditmediatype', 'sSubject' => ':link'). After this, adding :type in the tweet format would fill it with the type of link in the :link placeholder.

Finally, the option is available to attach image links in variables directly to tweets as media. This means the picture shows up inline to the tweet on the Twitter website, as a pic.twitter.com link. The option is bAttachImage, so the link from our example would be array('sVar' => ':link', 'sValue' => 'source', 'bAttachImage' => TRUE). If the variable does not contain a valid image URL, the option is ignored. Only one image per tweet is allowed.

Markov bot

The Markov bot will grab a body of text, generate Markov Chains from it, and then generate a new sentence from those that fills a tweet, and post it.

Markov Chain analysis of text creates an array of which word is likely to follow two or more other words. This way, new sentences can be generated that contain snippets from the original body of text that sorta look like it came from the original text. It works best if the source body of text is (very) large and has long sentences (i.e. tweets don't work very well).

Required arguments:

sUsername - the username of the Twitter account the bot will be posting from
sInputFile - the filename of the source body of text

[to be finished later]

Picture bot

This bot type will scan a folder filled with pictures and pick a random one to attach to a tweet. The tweet itself can use placeholders that are filled with properties of the chosen file, like filename, subfolder, file size, file type, etc. The folder is scanned recursively, so it can contain files as well as folders with files. The script will keep track of how many times each files was posted and always pick the one that was posted the least. You can opt to have it stop after every file was posted, or let it keep going indefinitely.

The available placeholders to use in the tweet format are:

filepath - the full path + filename of the picture, including document root path
dirname - the subdirectory the picture is in
filename - filename of the picture
basename - filename of the picture, excluding extension
extension - extension of the picture filename
size - file size of the picture, in kilobytes
width - width of the picture, in pixels
height - height of the picture, in pixels
created - date when the file was created, yyyy-mm-dd
modified - date when the file was last modified, yyyy-mm-dd

Required arguments:

sUsername - the username of the Twitter account the bot will be posting from
sPictureFolder - the folder containing all the pictures
sTweetFormat - format of the tweets

Optional arguments:

sSettingsFile - the filename of the settings file. default: [twitter username].json
iMaxIndexAge - how often to rescan the picture folder for changed files. default: 24h
bPostOnlyOnce - post all files once, then stop. default: false
sLogFile - the filename of the log file. default: [twitter username].log

Finishing up your Twitter bot

If all went well, you should have a small script that invokes one of the above base scripts, depending on the type. If you run that script, your bot will run once, and post one tweet. Once that works, the final step is to upload those files to your webhosting provider, and setup a task (a cronjob) to run that script periodically. Consult your provider's documentation on how to do this.

OPEN ISSUES

See all

Always load db lib in libraries by @mrbellek
Implement config setting allow_mentions by @mrbellek
Convert from OAuth.php/twitteroauth to abraham/twitteroauth library by @mrbellek
Fix so that lib/format respects the config setting attach_image by @mrbellek
Remove content from tweet that's already attached to it by @mrbellek
Logview page issues by @mrbellek
Convert to OO: exactotweet by @mrbellek
Get gif/movie attachments working with new twitteroauth lib by @mrbellek
Turn off output for cron by @offtrolley
Update logger documentation by @mrbellek
Update Media lib to support Reddit image galleries by @mrbellek

RELEASES

See all

mrbellek/twitterbot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Readme

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">(Note: this repo is being refactored to be more modular and object oriented. New scripts are in the /oo folder)

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Introduction

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Setting up a bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Create a Twitter account and app for the bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Create a PHP script

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Retweet bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Tweet bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">RSS bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Markov bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Picture bot

octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true">Finishing up your Twitter bot

OPEN ISSUES

RELEASES

Readme

(Note: this repo is being refactored to be more modular and object oriented. New scripts are in the /oo folder)

Introduction

Setting up a bot

Create a Twitter account and app for the bot

Create a PHP script

Retweet bot

Tweet bot

RSS bot

Markov bot

Picture bot

Finishing up your Twitter bot