I have a Hive table on a remote Hadoop node running on a Linux machine. I'm having trouble inserting a large JSON string, possibly 64 MB or more, since MapReduce won't work well unless the data approaches that size. I've successfully transferred 8-9 MB, but that's the ceiling: if I attempt more than that, the query fails. I also had to override C#'s default JSON serializer to do this. Not good practice, I know, but I don't know any other way to do it.
Anyway, this is how I store data in Hive:
    using System;
    using System.Data.Odbc;
    using System.IO;
    using System.Web.Mvc;

    namespace HadoopWebService.Controllers
    {
        public class LogsController : Controller
        {
            // POST: HadoopRequest
            [HttpPost]
            public ContentResult Create(string json)
            {
                // Read the raw request body; the bound "json" parameter is not used.
                Stream req = Request.InputStream;
                req.Seek(0, SeekOrigin.Begin);
                string request = new StreamReader(req).ReadToEnd();

                OdbcConnection hiveConnection = new OdbcConnection("DSN=Hadoop Server;UID=XXXX;PWD=XXXX");
                try
                {
                    hiveConnection.Open();
                    // The whole payload is embedded in the statement text as a string literal.
                    string query = "INSERT INTO TABLE error_log (json_error_log) VALUES('" + request + "')";
                    OdbcCommand command = new OdbcCommand(query, hiveConnection);
                    command.ExecuteNonQuery();
                    return new ContentResult { Content = "{status: 1}", ContentType = "application/json" };
                }
                catch (Exception error)
                {
                    System.Diagnostics.Debug.WriteLine(error.Message);
                    return new ContentResult { Content = "{status: 0, message:" + error.ToString() + "}" };
                }
                finally
                {
                    // Close the connection on both the success and the failure path.
                    hiveConnection.Close();
                }
            }
        }
    }
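Because the payload is concatenated straight into the statement, it may help to separate two possible failure causes: an unescaped quote in the JSON, and the sheer size of the statement. Below is a minimal sketch of a helper for that. The class and method names (HiveInsertSketch, EscapeForHiveLiteral, BuildInsert, Utf8Size) are hypothetical and not part of the controller above, and it assumes the Hive version in use accepts backslash escapes inside single-quoted string literals.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original controller.
public static class HiveInsertSketch
{
    // Assumption: Hive treats \' and \\ as escapes inside a single-quoted literal.
    public static string EscapeForHiveLiteral(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        // Escape backslashes first so the backslashes added for quotes are not doubled.
        return value.Replace("\\", "\\\\").Replace("'", "\\'");
    }

    // Builds the same INSERT statement as the controller, with the payload escaped.
    public static string BuildInsert(string json)
    {
        return "INSERT INTO TABLE error_log (json_error_log) VALUES('"
               + EscapeForHiveLiteral(json) + "')";
    }

    // Size of the text in bytes when encoded as UTF-8, for logging.
    public static int Utf8Size(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }
}
```

Logging Utf8Size(query) next to each success or failure would show whether the failures really start at a fixed size, or whether they depend on the content of the payload.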
Is there a setting I can use to insert larger amounts of data? I assume some buffer is failing to hold everything. I've searched Google but haven't found anything, probably because this isn't the proper way to insert into Hadoop. I'm out of options right now: I can't use HDInsight, and all I've got is the ODBC connection.
EDIT: This is the error I get:
System.Data.Odbc.OdbcException (0x80131937): ERROR [HY000][HiveODBC] (35) Error from Hive: error code: '0' error message: 'ExecuteStatement finished with operation state: ERROR_STATE'.
message:System.Data.Odbc.OdbcException (0x80131937): ERROR [HY000] [Microsoft][HiveODBC] (35) Error from Hive: error code: '0' error message: 'ExecuteStatement finished with operation state: ERROR_STATE'.
   at System.Data.Odbc.OdbcConnection.HandleError(OdbcHandle hrHandle, RetCode retcode)
   at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior behavior, String method, Boolean needReader, Object[] methodArguments, SQL_API odbcApiMethod)
   at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior behavior, String method, Boolean needReader)
   at System.Data.Odbc.OdbcCommand.ExecuteNonQuery()