What’s new

Compared with the previous BD|CESGA platform, the main improvements are:

  • Hadoop has been upgraded to Hadoop 3.

  • Spark 2.4 is now the default version.

  • HUE 4 is now available.

  • HDFS erasure coding: reduces storage overhead compared with the default 3x replication.

  • Impala is now available as an alternative to Hive for interactive SQL queries.

  • The HOME filesystem has been migrated from GlusterFS to the new NetApp storage system, which has greatly reduced its latency.

  • Improved reliability:

    • The HDFS NameNode now runs in a high-availability (HA) configuration.

    • The YARN ResourceManager now runs in a high-availability (HA) configuration.

  • Improved security:

    • SSL/TLS is now enabled for more secure communications.
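
To make the storage saving from HDFS erasure coding concrete, the following is a minimal sketch that compares the raw disk space needed under 3x replication with a Reed-Solomon layout of 6 data units and 3 parity units. This layout matches Hadoop 3's built-in RS-6-3-1024k policy, but it is an assumption here: the policy actually enabled on the platform is not stated above. The function names are hypothetical, and the calculation ignores padding of partial stripes and NameNode metadata.

```python
def raw_size_replication(logical_size: float, replicas: int = 3) -> float:
    """Raw storage consumed when every block is stored `replicas` times."""
    return logical_size * replicas


def raw_size_erasure_coded(
    logical_size: float, data_units: int = 6, parity_units: int = 3
) -> float:
    """Raw storage consumed under Reed-Solomon(data_units, parity_units).

    Assumption: RS(6, 3), as in Hadoop 3's built-in RS-6-3-1024k policy.
    Padding of partially filled stripes is ignored.
    """
    return logical_size * (data_units + parity_units) / data_units


def overhead(raw_size: float, logical_size: float) -> float:
    """Extra storage as a fraction of the logical data size."""
    return (raw_size - logical_size) / logical_size


if __name__ == "__main__":
    logical_tb = 600  # hypothetical dataset size in TB
    replicated = raw_size_replication(logical_tb)
    erasure_coded = raw_size_erasure_coded(logical_tb)
    print(f"3x replication: {replicated} TB raw, "
          f"overhead {overhead(replicated, logical_tb):.0%}")
    print(f"RS(6,3) coding: {erasure_coded} TB raw, "
          f"overhead {overhead(erasure_coded, logical_tb):.0%}")
```

Under these assumptions, 600 TB of data occupies 1800 TB with 3x replication (200% overhead) and 900 TB with RS(6,3) (50% overhead), while both layouts tolerate the loss of more than one storage unit per block group.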