Gartner Identifies Top 10 Data and Analytics Technology Trends for 2019

Trend No. 1: Augmented Analytics

Augmented analytics is the next wave of disruption in the data and analytics market. It uses machine learning (ML) and AI techniques to transform how analytics content is developed, consumed and shared.

By 2020, augmented analytics will be a dominant driver of new purchases of analytics and BI, as well as data science and ML platforms, and of embedded analytics. Data and analytics leaders should plan to adopt augmented analytics as platform capabilities mature.

Trend No. 2: Augmented Data Management

Augmented data management leverages ML capabilities and AI engines to make enterprise information management categories, including data quality, metadata management, master data management, data integration and database management systems (DBMSs), self-configuring and self-tuning. It automates many manual tasks, allowing less technically skilled users to work with data more autonomously and freeing highly skilled technical resources to focus on higher-value tasks.

Augmented data management converts metadata from being used for audit, lineage and reporting only, to powering dynamic systems. Metadata is changing from passive to active and is becoming the primary driver for all AI/ML.

Through the end of 2022, manual data management tasks will be reduced by 45 percent through the addition of ML and automated service-level management.

Trend No. 3: Continuous Intelligence

By 2022, more than half of major new business systems will incorporate continuous intelligence that uses real-time context data to improve decisions.

Continuous intelligence is a design pattern in which real-time analytics are integrated within a business operation, processing current and historical data to prescribe actions in response to events. It provides decision automation or decision support. Continuous intelligence leverages multiple technologies such as augmented analytics, event stream processing, optimization, business rule management and ML.

“Continuous intelligence represents a major change in the job of the data and analytics team,” said Ms. Sallam. “It’s a grand challenge — and a grand opportunity — for analytics and BI (business intelligence) teams to help businesses make smarter real-time decisions in 2019. It could be seen as the ultimate in operational BI.”

Trend No. 4: Explainable AI

AI models are increasingly deployed to augment and replace human decision making. However, in some scenarios, businesses must justify how these models arrive at their decisions. To build trust with users and stakeholders, application leaders must make these models more interpretable and explainable.

Unfortunately, most of these advanced AI models are complex black boxes that are not able to explain why they reached a specific recommendation or a decision. Explainable AI in data science and ML platforms, for example, auto-generates an explanation of models in terms of accuracy, attributes, model statistics and features in natural language.

Trend No. 5: Graph

Graph analytics is a set of analytic techniques that allows for the exploration of relationships between entities of interest such as organizations, people and transactions.

The application of graph processing and graph DBMSs will grow at 100 percent annually through 2022 to continuously accelerate data preparation and enable more complex and adaptive data science. 

Graph data stores can efficiently model, explore and query data with complex interrelationships across data silos, but the need for specialized skills has limited their adoption to date, according to Gartner.

Graph analytics will grow in the next few years due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries.

Trend No. 6: Data Fabric

Data fabric enables frictionless access and sharing of data in a distributed data environment. It enables a single and consistent data management framework, which allows seamless data access and processing by design across otherwise siloed storage.

Through 2022, bespoke data fabric designs will be deployed primarily as a static infrastructure, forcing organizations into a new wave of cost to completely re-design for more dynamic data mesh approaches.

Trend No. 7: NLP/Conversational Analytics

By 2020, 50 percent of analytical queries will be generated via search, natural language processing (NLP) or voice, or will be automatically generated. The need to analyze complex combinations of data and to make analytics accessible to everyone in the organization will drive broader adoption, allowing analytics tools to be as easy as a search interface or a conversation with a virtual assistant.

Trend No. 8: Commercial AI and ML

Gartner predicts that by 2022, 75 percent of new end-user solutions leveraging AI and ML techniques will be built with commercial solutions rather than open source platforms.

Commercial vendors have now built connectors into the open-source ecosystem, and they provide the enterprise features necessary to scale and democratize AI and ML, such as project and model management, reuse, transparency, data lineage, and platform cohesiveness and integration, which open-source technologies lack.

Trend No. 9: Blockchain

The core value proposition of blockchain, and distributed ledger technologies, is providing decentralized trust across a network of untrusted participants. The potential ramifications for analytics use cases are significant, especially those leveraging participant relationships and interactions.

However, it will be several years before four or five major blockchain technologies become dominant. Until that happens, technology end users will be forced to integrate with the blockchain technologies and standards dictated by their dominant customers or networks. This includes integration with your existing data and analytics infrastructure. The costs of integration may outweigh any potential benefit. Blockchains are a data source, not a database, and will not replace existing data management technologies.

Trend No. 10: Persistent Memory Servers

New persistent-memory technologies will help reduce costs and complexity of adopting in-memory computing (IMC)-enabled architectures. Persistent memory represents a new memory tier between DRAM and NAND flash memory that can provide cost-effective mass memory for high-performance workloads. It has the potential to improve application performance, availability, boot times, clustering methods and security practices, while keeping costs under control. It will also help organizations reduce the complexity of their application and data architectures by decreasing the need for data duplication.

“The amount of data is growing quickly and the urgency of transforming data into value in real-time is growing at an equally rapid pace,” Mr. Feinberg said. “New server workloads are demanding not just faster CPU performance, but massive memory and faster storage.”

More information on how to use data and analytics for competitive advantage can be found on the Gartner Data & Analytics Insight Hub.

https://www.gartner.com/en/newsroom/press-releases/2019-02-18-gartner-identifies-top-10-data-and-analytics-technolo

Generating Keys for RSA Asymmetric Encryption

Using a Linux server as the reference, this post briefly explains how to generate a public key and a private key.

  1. Generating the private key
    – To create a private key named private.pem, run the command below.
    – This generates a 1024-bit key (longer keys are more secure; note that 1024-bit RSA is now considered weak, and 2048 bits or more is recommended).
[root@moongk ~]# openssl genrsa -out private.pem 1024
Generating RSA private key, 1024 bit long modulus
................................................++++++
..........++++++
e is 65537 (0x10001)

  2. Generating the public key
    – Generate the public key from the private key.
[root@moongk ~]# openssl rsa -in private.pem -out public.pem -outform PEM -pubout
writing RSA key
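
To confirm the pair actually works, the keys can be exercised end to end: encrypt a short message with the public key and decrypt it with the private key. A minimal sketch using openssl pkeyutl; the 2048-bit length and file names are this example's choices, not from the original post.

```shell
# Generate a 2048-bit private key and derive the matching public key
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem

# Encrypt with the public key, then decrypt with the private key
echo "hello rsa" > plain.txt
openssl pkeyutl -encrypt -pubin -inkey public.pem -in plain.txt -out cipher.bin
openssl pkeyutl -decrypt -inkey private.pem -in cipher.bin -out decrypted.txt
cat decrypted.txt   # prints "hello rsa"
```

Note that raw RSA can only encrypt a payload smaller than the key size; for real data, the usual pattern is to encrypt a symmetric key with RSA and the data with that symmetric key.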

Setting Up MySQL Replication

Replication often needs to be configured for MySQL backup and load distribution.
The setup is neither complex nor difficult, so it is easy to apply.

Master/slave setup starts with editing my.cnf. Each server-id must be unique; the master handles all CRUD operations, while the slave servers can be used as read-only. Open my.cnf and edit it as follows.

# master
[mysqld]
server-id = 1
log-bin = mysql-bin
# You can also write just log-bin, without specifying a file name.
expire_logs_days = 7
# Adding this option is recommended to keep binary logs from piling up indefinitely.

# slave1
[mysqld]
server-id = 2
read_only

# slave2
[mysqld]
server-id = 3
read_only

Now, on the master server, create the account that replication will use. (Here the account is named "moongk".)
mysql> GRANT REPLICATION SLAVE ON *.* TO 'moongk'@'192.168.100.100' IDENTIFIED BY 'password';
Once the account is created, the starting positions must be aligned. One way to do this is to block DB connections, or to hold a lock, until the master DB backup is complete.
mysql> show master status\G
It is a good idea to record the File and Position values, just in case.
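
One common pattern for taking a consistent dump with matching coordinates is to hold a read lock while recording the position and dumping. A sketch; the dump file name is a placeholder:

```
mysql> FLUSH TABLES WITH READ LOCK;
mysql> show master status\G
-- note File and Position, then, from a second shell:
~]# mysqldump -u root -p --all-databases > master_dump.sql
mysql> UNLOCK TABLES;
```

Alternatively, for InnoDB tables, mysqldump's --single-transaction --master-data options take a consistent snapshot and embed the coordinates in the dump itself, avoiding the lock.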

Now it is the slave servers' turn: restore the master's dump on each slave, then set the slave's starting position.

mysql> CHANGE MASTER TO MASTER_HOST='192.168.100.1', MASTER_USER='moongk', MASTER_PASSWORD='password', MASTER_LOG_FILE='mysql-bin.000000', MASTER_LOG_POS=120;

mysql> start slave;
mysql> show slave status\G
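
In the show slave status output, replication is healthy when both worker threads report Yes. An abridged sketch of the relevant fields:

```
mysql> show slave status\G
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
```

If either thread shows No, the Last_IO_Error and Last_SQL_Error fields in the same output describe the cause.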
 

Installing and Configuring SVN (Subversion) on CentOS7

On CentOS7, the default yum repository installs version 1.7.x.
Unless there is a particular reason not to, we will install 1.7 as-is. Installation is very simple:

~]# yum install subversion

Now that it is installed, it is time to create a repository and run the service; we will set things up under /home/svn.

First, create the svn directory and change the options in the svnserve file (to set the default repository root); then create the repository.

~]# mkdir /home/svn
~]# vi /etc/sysconfig/svnserve
(Screenshot: the modified /etc/sysconfig/svnserve file)
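
The edit typically amounts to pointing svnserve at the repository root. A sketch, assuming repositories live under /home/svn as in this post; on CentOS7 the file sets an OPTIONS variable:

```
# /etc/sysconfig/svnserve
# -r makes /home/svn the serving root, so repositories are
# addressed as svn://HOST/REPO_NAME rather than by full path
OPTIONS="-r /home/svn"
```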

Next, add a firewall rule so the SVN service can be reached from outside.
firewall-cmd --permanent --zone=public --add-port=3690/tcp
firewall-cmd --reload
firewall-cmd --list-all

(Screenshot: adding the firewall rule, reloading, and verifying the final registration)

Other ways to check:
ps -ef | grep svn : confirms the SVN process is running.
netstat -anp | grep svnserve : shows the SVN port (default: 3690).


Now it is time to create the repository and start SVN.

~]# cd /home/svn
~]# svnadmin create --fs-type fsfs moongk
~]# chmod -R g+ws moongk

The repository now exists, but it is not yet ready for use.
It must be configured first; the relevant settings are in the files that manage access rights to the repository.

First, edit svnserve.conf, located inside the repository created above at conf/svnserve.conf. The main changes are:
around line 19: anon-access = read => none
around line 27: password-db = passwd (uncomment if commented out)
around line 34: authz-db = authz (uncomment if commented out / optional)

Next, fill in the passwd and authz files referenced around lines 27 and 34. Their format is easy to understand once you open them, so the details are omitted here.
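
For reference, minimal passwd and authz files might look like the following. The user name, password, and group here are placeholders for illustration, not values from the original post:

```
# conf/passwd — plaintext "user = password" entries
[users]
moongk = secret

# conf/authz — path-based access rules
[groups]
dev = moongk

[/]
@dev = rw
* =
```

The final `* =` line denies access to everyone not matched above, which pairs with the anon-access = none setting.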

You can check the repository with a command such as svn list svn://127.0.0.1/moongk, and the service can be started and stopped as follows:

~]# systemctl start svnserve #start
~]# systemctl stop svnserve #stop
~]# systemctl restart svnserve #restart
~]# systemctl enable svnserve #start automatically at boot

Changing the Character Set in MySQL 5.6

A fresh MySQL install defaults to latin1; to change it to UTF-8, edit /etc/my.cnf as below and restart the daemon.

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.6/en/server-configuration-defaults.html

[mysql]
default-character-set = utf8

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M

character-set-client-handshake = FALSE
# Note: when init_connect appears twice, the later line overrides the earlier one,
# so the two statements are combined into a single line here.
init_connect="SET collation_connection = utf8_general_ci; SET NAMES utf8"
character-set-server = utf8
collation-server = utf8_general_ci

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

# Recommended in standard MySQL setup
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES 

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[client]
default-character-set = utf8
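
After restarting the daemon, the change can be verified from the mysql client. With the settings above, the client, connection, database, results, and server character sets should all report utf8 (an abridged sketch; the exact variable list varies slightly by version):

```
~]# systemctl restart mysqld
~]# mysql -u root -p -e "SHOW VARIABLES LIKE 'character_set_%'"
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |
| character_set_connection | utf8   |
| character_set_server     | utf8   |
| ...                      | ...    |
+--------------------------+--------+
```

Note that the change only applies to connections and tables created afterward; existing tables keep their original character set until explicitly converted.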

Installing Redis 5.x on CentOS7 (yum)

To install the latest version of redis, we will use the REMI repository.

~]# yum install http://rpms.remirepo.net/enterprise/remi-release-7.rpm

Once the repository is registered as above, the latest redis can easily be installed via yum.

~]# yum --enablerepo=remi install redis
(Screenshot: redis 5.0.4 installed via yum)

To allow external connections to redis, open the corresponding port (6379, the default) in firewalld as follows.

~]# sudo firewall-cmd --add-port=6379/tcp --permanent
~]# sudo firewall-cmd --reload
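
With the port open, start the service and check that it responds; the enable line registers redis to start at boot:

```
~]# systemctl start redis
~]# systemctl enable redis
~]# redis-cli ping
PONG
```

A PONG reply confirms the server is up and reachable on the default port.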

Installing MySQL 5.6 (5.7) on CentOS7 (yum)

For RHEL7 / CentOS7

MySQL 5.7
~]# rpm -ivh https://dev.mysql.com/get/mysql57-community-release-el7-11.noarch.rpm

MySQL 5.6
~]# rpm -ivh http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm

For RHEL6 / CentOS6

MySQL 5.7
~]# rpm -ivh https://dev.mysql.com/get/mysql57-community-release-el6-11.noarch.rpm

MySQL 5.6
~]# rpm -ivh http://dev.mysql.com/get/mysql-community-release-el6-5.noarch.rpm

Once the repository matching your OS and version is registered, yum is ready to install. You could install right away, but first let's check which packages are available.

(Screenshot: the package list returned by yum search mysql-community)

Now install mysql via yum, start the daemon, and set the initial root password.

~]# yum install mysql-community-server
~]# systemctl start mysqld
~]# mysql_secure_installation

On CentOS, note that the initial root password is sometimes stored in /root/.mysql_secret (MySQL 5.6); on 5.7 it can be found with grep 'password' /var/log/mysqld.log.

Installing PHP 7.1 (7.0, 7.2) on CentOS7 (yum)

The easiest installation method seems to be yum, so this post covers installing via yum. We will use the EPEL and REMI repositories; versions 7.0 through 7.2 can be installed. This post walks through installing 7.1.

If you are not logged in as "root", log in as root before proceeding.
Updating the system afterward is recommended, though not strictly required.
(The second command below only checks the current OS version and is unrelated to the installation.)

~]# su -
~]# cat /etc/redhat-release
~]# yum list updates
~]# yum update -y 

Now for the installation itself. Rather than lengthy explanations, the steps are kept to the minimum needed, so you can simply follow the commands to get it installed.

~]# yum install epel-release yum-utils -y
~]# yum install http://rpms.remirepo.net/enterprise/remi-release-7.rpm
~]# yum-config-manager --enable remi-php71
~]# yum install php php-common php-opcache php-mcrypt php-cli php-gd php-curl php-mysql php-mbstring php-pdo
For PHP 7.0 or 7.2, change the version suffix in remi-php71 to remi-php70 or remi-php72. After the final install command completes, verify the installed version with php -v.