Question

I am currently working on a project that compiles using JDK1.7, creates and runs Hadoop jobs using Cascading 1.2 (soon to be upgraded to 2.1) and uses a Cloudera distribution of Hadoop (0.20.2-cdh3u3).

I'm looking at how to modify my Cascading/Hadoop jobs to read and write all data to/from a MySQL database. It looks like Sqoop may be able to handle that.

However, from what I've seen so far there is little information or documentation on how to do this in Java (I understand Sqoop is mainly intended for batch jobs invoked from a shell), and the Java examples I have followed haven't worked for me. I tried using Sqoop 1.4 and switching my project to JDK 1.6, as I believe that is required (although it will break other parts of my project), but I still couldn't get it to work.

Does anyone know if what I'm trying to achieve is even possible? How are other people dealing with this problem? Will the release of SQOOP2 help at all?

The kinds of errors I'm seeing when I try to run an org.apache.sqoop.tool.ExportTool to export a CSV to a table are:

Can't initialize javac processor due to (most likely) a class loader problem: java.lang.NoClassDefFoundError: com/sun/tools/javac/processing/JavacProcessingEnvironment

Note: \tmp\sqoop-my.name\compile\9031edc8e43167c10f9f895b64aa79d5\MyTableName.java uses or overrides a deprecated API.

Encountered IOException running export job: java.io.IOException: Could not load jar \tmp\sqoop-my.name\compile\9031edc8e43167c10f9f895b64aa79d5\MyTableName.jar into JVM. (Could not find class MyTableName.)


Solution 3

Thanks Charles and Vikas. This certainly put me on the right track. I ended up using https://github.com/cwensel/cascading.jdbc, which uses the Hadoop classes DBInputFormat/DBOutputFormat and makes it easy to set up Cascading jobs that read from and write to a database.

To write, I just changed the output tap of my flow to:

import cascading.jdbc.JDBCScheme;
import cascading.jdbc.JDBCTap;
import cascading.jdbc.TableDesc;
import cascading.tap.Tap;

String url = "jdbc:mysql://localhost:3306/mydb?user=myusername&password=mypassword";
String driver = "com.mysql.jdbc.Driver";
String tableName = "mytable";
String[] columnNames = { "col1", "col2", "col3" }; // Columns I want to write to
TableDesc tableDesc = new TableDesc( tableName );

JDBCScheme dbScheme = new JDBCScheme( columnNames );
Tap dbOutputTap = new JDBCTap( url, driver, tableDesc, dbScheme );

And to read from the db I just made a tap that looked like this:

String url = "jdbc:mysql://localhost:3306/mydb?user=myusername&password=mypassword";
String driver = "com.mysql.jdbc.Driver";
String tableName = "mytable";
String[] columnNames = { "col1", "col2", "col3" }; // Columns I want to read from
TableDesc tableDesc = new TableDesc( tableName );

JDBCScheme dbScheme = new JDBCScheme( columnNames, "col1<40" );
Tap dbInputTap = new JDBCTap( url, driver, tableDesc, dbScheme );
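
To tie it together, here is a minimal sketch (not from the original answer; the pipe name is a hypothetical placeholder) of wiring the two taps into a Cascading 1.2 flow:

import java.util.Properties;

import cascading.flow.Flow;
import cascading.flow.FlowConnector;
import cascading.pipe.Pipe;

// Hypothetical wiring: read rows from dbInputTap, pass them through an
// identity pipe, and insert them via dbOutputTap. A real job would attach
// Each/Every operations to the pipe assembly.
Properties properties = new Properties();
Pipe pipe = new Pipe( "mysql-copy" );
Flow flow = new FlowConnector( properties ).connect( dbInputTap, dbOutputTap, pipe );
flow.complete();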

I came across Cascading-DBMigrate as well, but it seems it is only for reading from databases, not writing to them.

OTHER TIPS

Sqoop is designed for exporting/importing data between MySQL (and other relational databases) and Hadoop/HBase. A very good tutorial on Sqoop, which explains its various features, can be found here. I'm not sure if this is what you want to do.
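
That said, if you do want to drive Sqoop from Java rather than a shell, one approach (a sketch only, untested against the poster's CDH3 setup; connection details and paths are placeholders) is to hand the usual command-line arguments to Sqoop's runTool entry point. Note that Sqoop compiles a generated record class at runtime, so it needs a full JDK with tools.jar on the classpath; the NoClassDefFoundError for com/sun/tools/javac shown above is the classic symptom of running on a plain JRE.

import org.apache.sqoop.Sqoop;

// Equivalent to running `sqoop export ...` from the command line.
String[] args = new String[] {
    "export",
    "--connect", "jdbc:mysql://localhost:3306/mydb",
    "--username", "myusername",
    "--password", "mypassword",
    "--table", "mytable",
    "--export-dir", "/path/to/csv"
};
int ret = Sqoop.runTool( args ); // returns 0 on success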

If you need to read/write data from/to MySQL in MapReduce jobs, the Hadoop classes DBInputFormat/DBOutputFormat can be used, as suggested by @Charles.

If you just want to write your job output to MySQL, I would recommend using a different output format called DBOutputFormat as described here:

A companion class, DBOutputFormat, will allow you to write results back to a database. When setting up the job, call conf.setOutputFormat(DBOutputFormat.class); and then call DBConfiguration.configureDB() as before.

The DBOutputFormat.setOutput() method then defines how the results will be written back to the database. Its three arguments are the JobConf object for the job, a string defining the name of the table to write to, and an array of strings defining the fields of the table to populate. e.g., DBOutputFormat.setOutput(job, "employees", "employee_id", "name");.

The same DBWritable implementation that you created earlier will suffice to inject records back into the database. The write(PreparedStatement stmt) method will be invoked on each instance of the DBWritable that you pass to the OutputCollector from the reducer. At the end of reducing, those PreparedStatement objects will be turned into INSERT statements to run against the SQL database.

Where "as before" refers to this instruction:

DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver", "jdbc:mysql://localhost/mydatabase");

To read from MySQL it's all the same with DBInputFormat.
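
Putting the quoted instructions together, a minimal sketch for the old (JobConf) API in Hadoop 0.20 might look like this; the table and field names are the ones from the example above, and the record class itself is hypothetical:

import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBOutputFormat;
import org.apache.hadoop.mapred.lib.db.DBWritable;

// Each record the reducer emits as a key becomes one INSERT into "employees".
public class EmployeeRecord implements DBWritable {
    private int employeeId;
    private String name;

    public EmployeeRecord() {} // no-arg constructor required by Hadoop

    public EmployeeRecord( int employeeId, String name ) {
        this.employeeId = employeeId;
        this.name = name;
    }

    // Writing: bind one value per column named in DBOutputFormat.setOutput().
    public void write( PreparedStatement stmt ) throws SQLException {
        stmt.setInt( 1, employeeId );
        stmt.setString( 2, name );
    }

    // Reading (DBInputFormat): populate fields from the current row.
    public void readFields( ResultSet rs ) throws SQLException {
        this.employeeId = rs.getInt( 1 );
        this.name = rs.getString( 2 );
    }

    public static void configureJob( JobConf conf ) {
        conf.setOutputFormat( DBOutputFormat.class );
        DBConfiguration.configureDB( conf,
            "com.mysql.jdbc.Driver",
            "jdbc:mysql://localhost/mydatabase" );
        DBOutputFormat.setOutput( conf, "employees", "employee_id", "name" );
    }
}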

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow