Big Data Processing with Hadoop and Hive on top of the Google DataProc Service
Big Data Processing with Hadoop and Hive on top of the Google DataProc Service Offering. Business Scenario: One of our clients wanted us to evaluate Hadoop for data processing of about 5000 shops. The...
Hive Big Data Commands Reference
DELIVERBI Big Data Hive Command Reference. Today we are sharing some of the frequently used Hive commands and settings that will come in handy. To see the status of the Hive Server2 and also view the logs...
Scripting Hive Commands with Python
Scripting Hive Commands with Python. In the previous posts, we touched upon basic data processing using Hive, but all of it was interactive. We will discuss how to script these Hive commands using...
Google Big Query Clone / Copy DataSets / Tables - PROD --> TEST --> DEV Projects
Google Big Query Cloning of Datasets & Tables across GCS Projects. We searched the internet and could not find a simple script for cloning / copying Tables and Datasets from PROD --> TEST -->...
Hive Tez vs Presto 12 Billion Records and 250 Columns Performance
Hive Tez vs Presto as a Query Engine & Performance: 12 Billion Rows and a 250 Column Table with Dimensions. Here at DELIVERBI we have been implementing quite a few Big Data Projects. At one of our...
PowerBI Custom Direct Query ODBC Connector
PowerBI Custom Direct Query ODBC Connector. At DELIVERBI we are presented with various challenges. A client of ours uses Power BI as their main reporting and analytics tool and wanted Direct Query...
Hive HPLSQL setup on Google DataProc
Hive HPLSQL setup on Google DataProc. Google Dataproc Hadoop and Hive. Hive Version: 2.3.2 (Version Supports Tez Engine). More Information on...
Article 0
Google Drive API Google Sheets Extract to CSV with SERVICE KEY.JSON. A recent client had some Google Sheets stored on their Google Drive and needed them processed into the Hadoop Cluster and available in...
Apache Airflow 1.10.2 Maintenance Dags
Apache Airflow 1.10 / 1.10.2 + Maintenance Dags (Logs). We have noticed that on Airflow 1.10+, the maintenance DAGs that manage database and file system logs need to be updated to accommodate extra tables...
HADOOP Balancing the cluster
Hadoop Balancing the Cluster. Balancing a Hadoop cluster is, I feel, very important, and the nodes should have a deviation of no more than 1%. I feel that this helps in having a healthy Hadoop...
Yarn - Long Running Jobs Alerts for Hive TEZ, Application Containers
Hadoop / Hive / Yarn / Applications Long running Jobs alert script. We are currently at a client where there are some long running jobs in Hive that hog the resources. You can see the logs in the application...
Apache Airflow Check Previous Run Status for a DAG
Apache Airflow Check Previous Run Status for a DAG. The Problem: We encountered a scenario where, if any previous DAG run fails, the next scheduled DAG run should not proceed. Checked for the solution and...
GCS HADOOP/Hive to Google Big Query Migrate Data Quickly ORC Format
GCS Dataproc Hadoop (Hive) to GCS Big Query (BQ) Quick Transfer ORC File Formats. Software Used: Google Dataproc - Google's Hadoop cluster with add-ons such as Spark etc. GCS (Google Cloud Storage) - Buckets...
GCP Google BigQuery Cost Reporting - Control your costs
Setting up the BigQuery audit logs export within GCS. Monitor Your BigQuery Costs in GCP. Google Cloud Platform Instructions. Google Big Query is an excellent database but costs will have to be monitored...
Happy New Year 2020 from DELIVERBI and DataRocketPro
Happy New Year to all our friends and customers. Hope you have a prosperous new year. Remember we are now a Google partner and can help with all your Google Cloud Platform requests. As usual any...
Google Partner for 2020
DELIVERBI | Google Cloud Partner ready for 2020. Data Rocket Pro | Watch the Product Video for 2020 Here |
Table or partition location information from Apache Hive
How to list table or partition location from Hive Metastore. Goal: This article provides the SQL to list table or partition locations from Hive Metastore. Env: Hive metastore 0.13 on MySQL. Root Cause: In Hive...
HDFS Namenode, FsImage, Editlogs Backup And Restore
HDFS Namenode, FsImage, Editlogs Backup And Restore. How to perform HDFS metadata backup: Backing up HDFS primarily involves creating a latest fsimage and fetching & copying it to another DR location....
Jenkins vs Spinnaker Continuous Deployment
Jenkins vs Spinnaker: What are the differences? Continuous delivery (CD) platforms help DevOps teams manage software releases in short cycles, ensuring they are delivered safely, reliably and quickly....
Hive - Setting Up Hive Admin User - GCP Dataproc
HIVE ADMIN User setup - GCP etc 👀 We had issues assigning the hive admin user role to our admin users. We googled everything and the only solutions out there would not work. So we created our own that...
Oracle to GCP Google Cloud Data Migration
Oracle to Google Cloud Platform Data Migration. We have had many clients approach us, for different reasons, about moving from Oracle on-premise database solutions to the Google Cloud Platform. Here at...
GCP Dataproc HDFS Error on Creation of Cluster
Google Dataproc HDFS Error when spinning up a cluster. Error You Will Encounter: On creation of a Google Dataproc cluster you will get the following error. This can be due to a bad character in a username in...
Trino Graceful Shutdown on GCP Using Instance Groups
I was setting up a Trino cluster for one of my clients on GCP, and shutting down nodes caused query errors. I have been using Trino, previously called Presto, on GCP for over 4 years now. It's an amazing...
Creating a Basic Trino Service to Start On Demand Clusters for ADHOC Large...
Creating a Basic Trino Service to Start On Demand Clusters for ADHOC Large ETL Jobs on GCP Google Cloud Platform using Python Flask. We were working on a client site where a customer of ours wanted to...
Google Big Query (BQ) Results to GCS Storage Bucket Folder
Export data from BQ to a GCS (Google Cloud Storage) bucket. The example below will export a CSV file output to the bucket location in the "uri" below. declare unused STRING; export data...