Been working with Hadoop (2.4.0) and Hive (0.13.0) with HDInsight (3.1) and it decompresses GZIP files into CSV by default.  Nice!  So, loading data with a Hive query in Powershell:

$response = Invoke-Hive -Query @"

LOAD DATA INPATH 'wasb://$container@$storageAccountName.blob.core.windows.net/file.csv.gz' 

INTO TABLE logs;


"@ 

No additional work or arguments to pass. I thought I had to do something like specified in this post with the io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec but apparently not.

 

UPDATE: Just found this link: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage which goes into keeping compressed data in Hive which has a recommendation to create a SequenceFile.