Compared with the previous BD|CESGA platform, these are the main improvements:
Hadoop has been upgraded to Hadoop 3.
Spark 2.4 is now the default version.
HDFS erasure coding: reduces storage overhead compared with the default 3x replication.
Impala is now available as an alternative to Hive for interactive SQL queries.
The HOME filesystem has been migrated from GlusterFS to the new NetApp storage system, which has greatly reduced its latency.
The HDFS NameNode now runs in a high-availability (HA) configuration.
The YARN ResourceManager is now in HA configuration.
SSL/TLS is now enabled for more secure communications.
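
The storage saving from HDFS erasure coding mentioned in the list above can be illustrated with a little arithmetic. This is a minimal sketch that assumes a Reed-Solomon layout with 6 data units and 3 parity units (the RS-6-3 scheme that ships with Hadoop 3); the policy actually enabled on this platform is not stated here, and the function names and the 100 TB figure are purely illustrative.

```python
def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per logical byte when every block is copied `replicas` times."""
    return float(replicas)


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Raw bytes stored per logical byte for a (data_units, parity_units) striped layout."""
    return (data_units + parity_units) / data_units


def raw_storage_tb(logical_tb: float, overhead: float) -> float:
    """Raw capacity consumed to hold `logical_tb` of user data."""
    return logical_tb * overhead


if __name__ == "__main__":
    logical_tb = 100.0  # hypothetical dataset size, for illustration only

    rep = replication_overhead(3)          # default HDFS 3x replication
    ec = erasure_coding_overhead(6, 3)     # assumed RS-6-3 policy

    print(f"3x replication: {raw_storage_tb(logical_tb, rep):.0f} TB raw")
    print(f"RS-6-3 erasure coding: {raw_storage_tb(logical_tb, ec):.0f} TB raw")
```

Under these assumptions the same data occupies half the raw capacity (1.5x instead of 3x), while still tolerating the loss of up to three units per stripe.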