[Big Data_Hive] HiveQL

select ... from :: querying data

select  *  from  table_name         -- retrieve all columns (*)
select  column  from  table_name    -- retrieve only the specified column

where :: querying only the rows that match a condition

A condition on the result of group by is specified with having, not where.
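
The difference between where and having can be sketched as follows. The table orders and its columns item and amount are hypothetical names used only for illustration.

```sql
-- Hypothetical table: orders(item STRING, amount INT)
SELECT item, SUM(amount) AS total
FROM orders
WHERE amount > 0            -- filters individual rows before grouping
GROUP BY item
HAVING SUM(amount) >= 100;  -- filters whole groups after aggregation
```

where runs before the rows are grouped, so it cannot refer to an aggregate such as SUM(amount); having runs after grouping, so it can.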

as :: assigning an alias

Apple as a -- refer to Apple as a
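
A minimal sketch of aliases on both a column and a table, assuming a hypothetical table orders with columns item and amount:

```sql
-- Hypothetical table: orders(item STRING, amount INT)
SELECT o.item       AS item_name,   -- column alias
       o.amount * 2 AS doubled      -- alias for a computed expression
FROM orders AS o;                   -- table alias
```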

group by :: grouping rows for aggregation

select column from table_name group by column

distinct :: removes duplicate values (each value is returned only once)

select distinct column from table_name
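
As an illustration, the two queries below return the same set of values: one row per unique item. The table orders and the column item are hypothetical.

```sql
-- Hypothetical table: orders(item STRING, amount INT)
SELECT DISTINCT item FROM orders;       -- one row per unique item

SELECT item FROM orders GROUP BY item;  -- same set of values via group by
```

group by is the one to reach for when an aggregate such as COUNT(*) is also needed for each value.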

"partition"

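:: types (dynamic & static)

- dynamic partition?

: Partitions are created dynamically from the values of a column.

The partition values are not known when the query is written.

- static partition?

: The partition information is passed explicitly when data is loaded into the table.

The partition is known when the query is written.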
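
The two kinds of partition insert might look like the sketch below. The tables sales and staging_sales and their columns are hypothetical; the two SET properties are the standard Hive settings for enabling dynamic partitioning, though their defaults vary by version.

```sql
-- Hypothetical tables: sales (partitioned) and staging_sales (source data)
CREATE TABLE IF NOT EXISTS sales (item STRING, amount INT)
PARTITIONED BY (country STRING);

-- Static partition: the partition value is written in the query itself
INSERT INTO TABLE sales PARTITION (country = 'KR')
SELECT item, amount
FROM staging_sales
WHERE country = 'KR';

-- Dynamic partition: the value comes from the last column of the select
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT INTO TABLE sales PARTITION (country)
SELECT item, amount, country
FROM staging_sales;
```

In the dynamic case the partition column must come last in the select list, and Hive creates one partition per distinct value it finds.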

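create table table_name (column_name data_type) partitioned by (partition_column_name data_type)
--a partition works like a where condition: a query that filters on the partition column
--reads only the matching partitions, so less data is read and processing is faster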
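
A short sketch of how a partitioned table is declared and then queried. The table access_log and its columns are hypothetical names chosen for illustration.

```sql
-- Hypothetical table; the partition column dt is declared only in
-- PARTITIONED BY, not in the regular column list
CREATE TABLE IF NOT EXISTS access_log (user_id STRING, url STRING)
PARTITIONED BY (dt STRING);

-- Filtering on the partition column lets Hive read only that partition
SELECT user_id, url
FROM access_log
WHERE dt = '2024-01-01';

-- List the partitions that currently exist
SHOW PARTITIONS access_log;
```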

"sorting"

:: order by (asc / desc)

select * from table_name order by numeric_column   --sorts the entire result set

:: sort by

select * from table_name sort by numeric_column   --sorts rows only within each reducer
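
To make the contrast between order by and sort by concrete, here is a sketch assuming a hypothetical table orders and a job forced to use two reducers. mapreduce.job.reduces is the Hadoop property for the reducer count; the value 2 is arbitrary.

```sql
-- Hypothetical table: orders(item STRING, amount INT)
SET mapreduce.job.reduces = 2;  -- arbitrary value, for illustration only

-- Total order: every row is sorted relative to every other row
SELECT item, amount FROM orders ORDER BY amount DESC;

-- Partial order: each reducer's output is sorted on its own, so the
-- combined result is not guaranteed to be globally sorted
SELECT item, amount FROM orders SORT BY amount DESC;
```

With a single reducer the two queries give the same ordering; the difference only appears once there is more than one reducer.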

:: distribute by

select * from table_name distribute by numeric_column   --rows with the same value go to the same reducer (no sorting)
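
distribute by is often paired with sort by so that each reducer receives whole groups and then orders them. A minimal sketch, assuming a hypothetical table orders:

```sql
-- Hypothetical table: orders(item STRING, amount INT)
-- All rows for the same item go to the same reducer,
-- and each reducer sorts its rows by amount, largest first
SELECT item, amount
FROM orders
DISTRIBUTE BY item
SORT BY amount DESC;
```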

:: cluster by

select * from table_name cluster by numeric_column   --performs distribute by and sort by together
                                                     --rows with the same value go to the same reducer, then each reducer sorts its rows
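
The equivalence can be sketched as two queries that behave the same way; the table orders is a hypothetical name.

```sql
-- Hypothetical table: orders(item STRING, amount INT)
SELECT item, amount FROM orders CLUSTER BY item;

-- Same effect written out: distribute and sort on the same column
SELECT item, amount FROM orders DISTRIBUTE BY item SORT BY item;
```

When the distribution column and the sort column differ, or a descending sort is needed, the explicit distribute by ... sort by form is the one to use.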

daizy's_dream