편집 요약 없음 |
편집 요약 없음 |
||
| 1번째 줄: | 1번째 줄: | ||
= 성능 문제 식별 방법론과 튜닝 접근법 = | |||
== 1. 성능 문제 분석 방법론 == | |||
=== 1.1 Top-Down 접근법 (시스템 → 세션 → SQL) === | |||
**단계별 분석 순서:** | **단계별 분석 순서:** | ||
| 20번째 줄: | 20번째 줄: | ||
- Statspack (AWR 대체) | - Statspack (AWR 대체) | ||
=== 1.2 Bottom-Up 접근법 (SQL → 세션 → 시스템) === | |||
**적용 시나리오:** | **적용 시나리오:** | ||
| 32번째 줄: | 32번째 줄: | ||
- SQL Tuning Advisor | - SQL Tuning Advisor | ||
== 2. 성능 문제 분류 체계 == | |||
=== 2.1 문제 유형별 분류 === | |||
==== A. 응답 시간 문제 (Response Time Issues) ==== | |||
```sql | ```sql | ||
-- 응답 시간 분석을 위한 기본 쿼리 | -- 응답 시간 분석을 위한 기본 쿼리 | ||
SELECT | SELECT | ||
s.sid, s.serial | s.sid, s.serial=, s.username, s.machine, s.program, s | ||
s.status, s.last_call_et, | s.status, s.last_call_et, | ||
w.event, w.state, w.wait_time, w.seconds_in_wait, | w.event, w.state, w.wait_time, w.seconds_in_wait, | ||
sq.sql_text | sq.sql_text | ||
FROM v$session s, v$session_wait w, v$sqlarea sq | FROM v$session s, v$session_wait w, v$sqlarea sq | ||
WHERE s.sid = w.sid | WHERE s.sid = w.sid WHERE | ||
AND s.sql_address = sq.address(+) | AND s.sql_address = sq.address(+) AND | ||
AND s.sql_hash_value = sq.hash_value(+) | AND s.sql_hash_value = sq.hash_value(+) AND | ||
AND s.status = 'ACTIVE' | AND s.status = 'ACTIVE' AND | ||
ORDER BY s.last_call_et DESC; | ORDER BY s.last_call_et DESC; | ||
``` | ``` | ||
==== B. 처리량 문제 (Throughput Issues) ==== | |||
```sql | ```sql | ||
-- 시간당 트랜잭션 처리량 분석 | -- 시간당 트랜잭션 처리량 분석 | ||
| 58번째 줄: | 58번째 줄: | ||
TO_CHAR(begin_interval_time, 'YYYY-MM-DD HH24') as hour, | TO_CHAR(begin_interval_time, 'YYYY-MM-DD HH24') as hour, | ||
(SELECT value FROM dba_hist_sysstat dhss2 | (SELECT value FROM dba_hist_sysstat dhss2 | ||
WHERE dhss2.snap_id = dhs.snap_id | WHERE dhss2.snap_id = dhs.snap_id WHERE | ||
AND dhss2.stat_name = 'user commits') / | AND dhss2.stat_name = 'user commits') / AND | ||
(EXTRACT(HOUR FROM (end_interval_time - begin_interval_time)) * 3600 + | (EXTRACT(HOUR FROM (end_interval_time - begin_interval_time)) * 3600 + | ||
EXTRACT(MINUTE FROM (end_interval_time - begin_interval_time)) * 60) as tps | EXTRACT(MINUTE FROM (end_interval_time - begin_interval_time)) * 60) as tps | ||
FROM dba_hist_snapshot dhs | FROM dba_hist_snapshot dhs | ||
WHERE begin_interval_time >= SYSDATE - 7 | WHERE begin_interval_time >= SYSDATE - 7 WHERE | ||
ORDER BY begin_interval_time; | ORDER BY begin_interval_time; | ||
``` | ``` | ||
==== C. 리소스 고갈 문제 (Resource Exhaustion) ==== | |||
```sql | ```sql | ||
-- 메모리 사용률 분석 | -- 메모리 사용률 분석 | ||
| 80번째 줄: | 80번째 줄: | ||
``` | ``` | ||
=== 2.2 문제 심각도 분류 === | |||
==== 긴급도 매트릭스 ==== | |||
| 심각도 | 영향 범위 | 응답 시간 | 비즈니스 영향 | | | 심각도 | 영향 범위 | 응답 시간 | 비즈니스 영향 | | ||
|--------|-----------|-----------|---------------| | |--------|-----------|-----------|---------------| | ||
| 90번째 줄: | 90번째 줄: | ||
| P4 | 미미한 영향 | < 50% 증가 | 성능 개선 여지 | | | P4 | 미미한 영향 | < 50% 증가 | 성능 개선 여지 | | ||
== 3. Wait Event 기반 근본 원인 분석 == | |||
=== 3.1 Wait Event 분류 및 진단 === | |||
==== A. User I/O 관련 Wait Events ==== | |||
```sql | ```sql | ||
-- I/O 관련 대기 이벤트 상세 분석 | -- I/O 관련 대기 이벤트 상세 분석 | ||
| 105번째 줄: | 105번째 줄: | ||
time_waited/total_waits*1000 as avg_wait_ms | time_waited/total_waits*1000 as avg_wait_ms | ||
FROM v$system_event | FROM v$system_event | ||
WHERE wait_class = 'User I/O' | WHERE wait_class = 'User I/O' WHERE | ||
AND total_waits > 0 | AND total_waits > 0 | ||
ORDER BY time_waited DESC; | ORDER BY time_waited DESC; | ||
| 121번째 줄: | 121번째 줄: | ||
ELSE 0 END as avg_read_time_ms | ELSE 0 END as avg_read_time_ms | ||
FROM dba_data_files df, v$filestat fs | FROM dba_data_files df, v$filestat fs | ||
WHERE df.file_id = fs.file | WHERE df.file_id = fs.file= WHERE | ||
ORDER BY avg_read_time_ms DESC; | ORDER BY avg_read_time_ms DESC; | ||
``` | ``` | ||
| 131번째 줄: | 131번째 줄: | ||
- `log file sync`: 커밋 대기 | - `log file sync`: 커밋 대기 | ||
==== B. Concurrency 관련 Wait Events ==== | |||
```sql | ```sql | ||
-- Lock 경합 분석 | -- Lock 경합 분석 | ||
| 144번째 줄: | 144번째 줄: | ||
s2.seconds_in_wait | s2.seconds_in_wait | ||
FROM v$locked_object lo, dba_objects do, v$session s1, v$session s2 | FROM v$locked_object lo, dba_objects do, v$session s1, v$session s2 | ||
WHERE lo.object_id = do.object_id | WHERE lo.object_id = do.object_id WHERE | ||
AND lo.session_id = s1.sid | AND lo.session_id = s1.sid AND | ||
AND s2.blocking_session = s1.sid; | AND s2.blocking_session = s1.sid; AND | ||
``` | ``` | ||
| 155번째 줄: | 155번째 줄: | ||
- `library cache lock`: 공유 SQL 영역 경합 | - `library cache lock`: 공유 SQL 영역 경합 | ||
=== 3.2 Wait Event 패턴 분석 === | |||
==== 시간대별 Wait Event 트렌드 ==== | |||
```sql | ```sql | ||
-- 시간대별 주요 Wait Event 분석 (AWR 기반) | -- 시간대별 주요 Wait Event 분석 (AWR 기반) | ||
| 170번째 줄: | 170번째 줄: | ||
))/1000000, 2) as time_waited_sec | ))/1000000, 2) as time_waited_sec | ||
FROM dba_hist_system_event e, dba_hist_snapshot s | FROM dba_hist_system_event e, dba_hist_snapshot s | ||
WHERE e.snap_id = s.snap_id | WHERE e.snap_id = s.snap_id WHERE | ||
AND s.begin_interval_time >= SYSDATE - 1 | AND s.begin_interval_time >= SYSDATE - 1 AND | ||
AND e.wait_class != 'Idle' | AND e.wait_class != 'Idle' AND | ||
AND e.event_name IN ( | AND e.event_name IN ( | ||
'db file sequential read', | 'db file sequential read', | ||
| 182번째 줄: | 182번째 줄: | ||
``` | ``` | ||
== 4. 시스템 리소스 분석 == | |||
=== 4.1 CPU 사용률 분석 === | |||
==== OS Level CPU 분석 ==== | |||
```sql | ```sql | ||
-- CPU 사용률 히스토리 (AWR 기반) | -- CPU 사용률 히스토리 (AWR 기반) | ||
| 195번째 줄: | 195번째 줄: | ||
ROUND(os2.value/os1.value * 100, 2) as cpu_util_pct | ROUND(os2.value/os1.value * 100, 2) as cpu_util_pct | ||
FROM dba_hist_osstat os1, dba_hist_osstat os2, dba_hist_snapshot s | FROM dba_hist_osstat os1, dba_hist_osstat os2, dba_hist_snapshot s | ||
WHERE os1.snap_id = os2.snap_id | WHERE os1.snap_id = os2.snap_id WHERE | ||
AND os1.snap_id = s.snap_id | AND os1.snap_id = s.snap_id AND | ||
AND os1.stat_name = 'NUM_CPUS' | AND os1.stat_name = 'NUM_CPUS' AND | ||
AND os2.stat_name = 'LOAD' | AND os2.stat_name = 'LOAD' AND | ||
AND s.begin_interval_time >= SYSDATE - 1 | AND s.begin_interval_time >= SYSDATE - 1 AND | ||
ORDER BY time; | ORDER BY time; | ||
``` | ``` | ||
==== Oracle Process CPU 사용률 ==== | |||
```sql | ```sql | ||
-- 세션별 CPU 사용률 | -- 세션별 CPU 사용률 | ||
SELECT | SELECT | ||
s.sid, s.serial | s.sid, s.serial=, s.username, s.machine, s.program, s | ||
ss.value/100 as cpu_used_seconds, | ss.value/100 as cpu_used_seconds, | ||
s.last_call_et as seconds_since_last_call | s.last_call_et as seconds_since_last_call | ||
FROM v$session s, v$sesstat ss, v$statname sn | FROM v$session s, v$sesstat ss, v$statname sn | ||
WHERE s.sid = ss.sid | WHERE s.sid = ss.sid WHERE | ||
AND ss.statistic | AND ss.statistic= = sn.statistic= AND | ||
AND sn.name = 'CPU used by this session' | AND sn.name = 'CPU used by this session' AND | ||
AND ss.value > 0 | AND ss.value > 0 | ||
ORDER BY ss.value DESC; | ORDER BY ss.value DESC; | ||
``` | ``` | ||
=== 4.2 메모리 사용률 분석 === | |||
==== SGA 구성요소별 분석 ==== | |||
```sql | ```sql | ||
-- SGA 구성요소별 사용률 | -- SGA 구성요소별 사용률 | ||
| 245번째 줄: | 245번째 줄: | ||
``` | ``` | ||
==== PGA 메모리 분석 ==== | |||
```sql | ```sql | ||
-- 세션별 PGA 사용량 | -- 세션별 PGA 사용량 | ||
SELECT | SELECT | ||
s.sid, s.serial | s.sid, s.serial=, s.username, s.program, s | ||
ROUND(p.value/1024/1024, 2) as pga_used_mb, | ROUND(p.value/1024/1024, 2) as pga_used_mb, | ||
ROUND(p2.value/1024/1024, 2) as pga_alloc_mb, | ROUND(p2.value/1024/1024, 2) as pga_alloc_mb, | ||
| 255번째 줄: | 255번째 줄: | ||
FROM v$session s, v$process pr, v$sesstat p, v$sesstat p2, v$sesstat p3, | FROM v$session s, v$process pr, v$sesstat p, v$sesstat p2, v$sesstat p3, | ||
v$statname sn, v$statname sn2, v$statname sn3 | v$statname sn, v$statname sn2, v$statname sn3 | ||
WHERE s.paddr = pr.addr | WHERE s.paddr = pr.addr WHERE | ||
AND s.sid = p.sid AND p.statistic | AND s.sid = p.sid AND p.statistic= = sn.statistic= AND | ||
AND s.sid = p2.sid AND p2.statistic | AND s.sid = p2.sid AND p2.statistic= = sn2.statistic= AND | ||
AND s.sid = p3.sid AND p3.statistic | AND s.sid = p3.sid AND p3.statistic= = sn3.statistic= AND | ||
AND sn.name = 'session pga memory' | AND sn.name = 'session pga memory' AND | ||
AND sn2.name = 'session pga memory max' | AND sn2.name = 'session pga memory max' AND | ||
AND sn3.name = 'session uga memory max' | AND sn3.name = 'session uga memory max' AND | ||
AND p.value > 1024*1024 | AND p.value > 1024*1024 | ||
ORDER BY pga_used_mb DESC; | ORDER BY pga_used_mb DESC; | ||
``` | ``` | ||
== 5. SQL 레벨 근본 원인 분석 == | |||
=== 5.1 Top SQL 식별 및 분석 === | |||
==== 리소스 소비량 기준 Top SQL ==== | |||
```sql | ```sql | ||
-- CPU 시간 기준 Top SQL | -- CPU 시간 기준 Top SQL | ||
| 306번째 줄: | 306번째 줄: | ||
``` | ``` | ||
=== 5.2 실행 계획 분석 === | |||
==== 비효율적인 실행 계획 패턴 식별 ==== | |||
```sql | ```sql | ||
-- Full Table Scan이 발생하는 SQL 식별 | -- Full Table Scan이 발생하는 SQL 식별 | ||
| 318번째 줄: | 318번째 줄: | ||
SUBSTR(s.sql_text, 1, 100) as sql_preview | SUBSTR(s.sql_text, 1, 100) as sql_preview | ||
FROM v$sql_plan p, v$sql s | FROM v$sql_plan p, v$sql s | ||
WHERE p.sql_id = s.sql_id | WHERE p.sql_id = s.sql_id WHERE | ||
AND p.child_number = s.child_number | AND p.child_number = s.child_number AND | ||
AND p.operation = 'TABLE ACCESS' | AND p.operation = 'TABLE ACCESS' AND | ||
AND p.options = 'FULL' | AND p.options = 'FULL' AND | ||
AND s.executions > 10 | AND s.executions > 10 | ||
GROUP BY p.sql_id, s.executions, s.cpu_time, s.sql_text | GROUP BY p.sql_id, s.executions, s.cpu_time, s.sql_text | ||
| 341번째 줄: | 341번째 줄: | ||
``` | ``` | ||
== 6. 애플리케이션 레벨 분석 == | |||
=== 6.1 Connection Pool 분석 === | |||
```sql | ```sql | ||
-- 세션 상태별 분포 | -- 세션 상태별 분포 | ||
| 351번째 줄: | 351번째 줄: | ||
ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) as percentage | ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) as percentage | ||
FROM v$session | FROM v$session | ||
WHERE type = 'USER' | WHERE type = 'USER' WHERE | ||
GROUP BY status | GROUP BY status | ||
ORDER BY session_count DESC; | ORDER BY session_count DESC; | ||
| 357번째 줄: | 357번째 줄: | ||
-- 장시간 비활성 세션 식별 | -- 장시간 비활성 세션 식별 | ||
SELECT | SELECT | ||
sid, serial | sid, serial=, username, machine, program, sid | ||
status, ROUND(last_call_et/3600, 2) as hours_inactive, | status, ROUND(last_call_et/3600, 2) as hours_inactive, | ||
logon_time | logon_time | ||
FROM v$session | FROM v$session | ||
WHERE type = 'USER' | WHERE type = 'USER' WHERE | ||
AND status = 'INACTIVE' | AND status = 'INACTIVE' AND | ||
AND last_call_et > 3600 | AND last_call_et > 3600 | ||
ORDER BY last_call_et DESC; | ORDER BY last_call_et DESC; | ||
``` | ``` | ||
=== 6.2 트랜잭션 패턴 분석 === | |||
```sql | ```sql | ||
-- 장기 실행 트랜잭션 식별 | -- 장기 실행 트랜잭션 식별 | ||
SELECT | SELECT | ||
s.sid, s.serial | s.sid, s.serial=, s.username, s.machine, s | ||
t.start_time, | t.start_time, | ||
ROUND((SYSDATE - t.start_date) * 24 * 60, 2) as duration_minutes, | ROUND((SYSDATE - t.start_date) * 24 * 60, 2) as duration_minutes, | ||
| 377번째 줄: | 377번째 줄: | ||
r.name as rollback_segment | r.name as rollback_segment | ||
FROM v$session s, v$transaction t, v$rollname r | FROM v$session s, v$transaction t, v$rollname r | ||
WHERE s.taddr = t.addr | WHERE s.taddr = t.addr WHERE | ||
AND t.xidusn = r.usn | AND t.xidusn = r.usn AND | ||
AND (SYSDATE - t.start_date) * 24 * 60 > 30 | AND (SYSDATE - t.start_date) * 24 * 60 > 30 | ||
ORDER BY duration_minutes DESC; | ORDER BY duration_minutes DESC; | ||
``` | ``` | ||
== 7. 실전 진단 시나리오 == | |||
=== 시나리오 1: 갑작스러운 전체 시스템 성능 저하 === | |||
**진단 순서:** | **진단 순서:** | ||
| 393번째 줄: | 393번째 줄: | ||
SELECT event, count(*) as sessions | SELECT event, count(*) as sessions | ||
FROM v$session_wait | FROM v$session_wait | ||
WHERE wait_class != 'Idle' | WHERE wait_class != 'Idle' WHERE | ||
GROUP BY event | GROUP BY event | ||
ORDER BY sessions DESC; | ORDER BY sessions DESC; | ||
| 416번째 줄: | 416번째 줄: | ||
``` | ``` | ||
=== 시나리오 2: 특정 시간대 성능 저하 === | |||
**원인 분석 절차:** | **원인 분석 절차:** | ||
| 437번째 줄: | 437번째 줄: | ||
) as time_waited_delta_micro | ) as time_waited_delta_micro | ||
FROM dba_hist_system_event e, dba_hist_snapshot s | FROM dba_hist_system_event e, dba_hist_snapshot s | ||
WHERE e.snap_id = s.snap_id | WHERE e.snap_id = s.snap_id WHERE | ||
AND s.begin_interval_time BETWEEN | AND s.begin_interval_time BETWEEN | ||
TO_DATE('2024-01-15 14:00:00', 'YYYY-MM-DD HH24:MI:SS') AND | TO_DATE('2024-01-15 14:00:00', 'YYYY-MM-DD HH24:MI:SS') AND | ||
TO_DATE('2024-01-15 16:00:00', 'YYYY-MM-DD HH24:MI:SS') | TO_DATE('2024-01-15 16:00:00', 'YYYY-MM-DD HH24:MI:SS') | ||
AND e.wait_class != 'Idle' | AND e.wait_class != 'Idle' AND | ||
) | ) | ||
WHERE total_waits_delta > 0 | WHERE total_waits_delta > 0 | ||
| 447번째 줄: | 447번째 줄: | ||
``` | ``` | ||
== 8. 자동화된 근본 원인 분석 도구 == | |||
=== 8.1 ADDM 보고서 자동 생성 스크립트 === | |||
```sql | ```sql | ||
-- ADDM 보고서 생성 및 분석 | -- ADDM 보고서 생성 및 분석 | ||
DECLARE | DECLARE | ||
task_name VARCHAR2(30) := 'ADDM_ANALYSIS_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MI'); | task_name VARCHAR2(30) := 'ADDM_ANALYSIS_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MI'); task_name | ||
begin_snap NUMBER; | begin_snap NUMBER; | ||
end_snap NUMBER; | end_snap NUMBER; | ||
| 462번째 줄: | 462번째 줄: | ||
DBMS_ADDM.ANALYZE_DB( | DBMS_ADDM.ANALYZE_DB( | ||
task_name => task_name, | task_name => task_name, task_name | ||
begin_snapshot => begin_snap, | begin_snapshot => begin_snap, begin_snapshot | ||
end_snapshot => end_snap | end_snapshot => end_snap end_snapshot | ||
); | ); | ||
| 471번째 줄: | 471번째 줄: | ||
SELECT finding_name, impact, message | SELECT finding_name, impact, message | ||
FROM dba_addm_findings | FROM dba_addm_findings | ||
WHERE task_name = task_name | WHERE task_name = task_name WHERE | ||
ORDER BY impact DESC | ORDER BY impact DESC | ||
) LOOP | ) LOOP | ||
| 483번째 줄: | 483번째 줄: | ||
``` | ``` | ||
=== 8.2 Python 기반 성능 모니터링 자동화 === | |||
```python | ```python | ||
= Oracle 성능 모니터링 자동화 스크립트 예시 = | |||
import cx_Oracle | import cx_Oracle | ||
import pandas as pd | import pandas as pd | ||
| 493번째 줄: | 493번째 줄: | ||
class OraclePerformanceMonitor: | class OraclePerformanceMonitor: | ||
def __init__(self, connection_string): | def __init__(self, connection_string): | ||
self.connection = cx_Oracle.connect(connection_string) | self.connection = cx_Oracle.connect(connection_string) self | ||
def get_top_wait_events(self, hours_back=1): | def get_top_wait_events(self, hours_back=1): def | ||
query = """ | query = """ query | ||
SELECT event, total_waits, time_waited, average_wait | SELECT event, total_waits, time_waited, average_wait | ||
FROM v$system_event | FROM v$system_event | ||
WHERE wait_class != 'Idle' | WHERE wait_class != 'Idle' WHERE | ||
AND total_waits > 0 | AND total_waits > 0 | ||
ORDER BY time_waited DESC | ORDER BY time_waited DESC | ||
| 507번째 줄: | 507번째 줄: | ||
def analyze_performance_degradation(self): | def analyze_performance_degradation(self): | ||
= CPU 사용률 체크 = | |||
cpu_query = """ | cpu_query = """ cpu_query | ||
SELECT value FROM v$sysmetric | SELECT value FROM v$sysmetric | ||
WHERE metric_name = 'Host CPU Utilization (%)' | WHERE metric_name = 'Host CPU Utilization (%)' WHERE | ||
AND intsize_csec = 6000 | AND intsize_csec = 6000 AND | ||
""" | """ | ||
cpu_usage = pd.read_sql(cpu_query, self.connection) | cpu_usage = pd.read_sql(cpu_query, self.connection) cpu_usage | ||
= 임계치 초과 시 알림 = | |||
if cpu_usage['VALUE'].iloc[0] > 80: | if cpu_usage['VALUE'].iloc[0] > 80: | ||
return "HIGH_CPU_ALERT" | return "HIGH_CPU_ALERT" | ||
2025년 9월 16일 (화) 10:55 판
성능 문제 식별 방법론과 튜닝 접근법
1. 성능 문제 분석 방법론
1.1 Top-Down 접근법 (시스템 → 세션 → SQL)
- 단계별 분석 순서:**
``` 시스템 전체 성능 → 인스턴스 레벨 → 세션 레벨 → SQL 레벨 → 오브젝트 레벨 ```
- 장점:**
- 전체적인 시스템 상황 파악 가능 - 우선순위가 높은 문제부터 식별 - 리소스 사용률 기반의 객관적 분석
- 분석 도구:**
- ADDM (Automatic Database Diagnostic Monitor) - AWR (Automatic Workload Repository) - Statspack (AWR 대체)
1.2 Bottom-Up 접근법 (SQL → 세션 → 시스템)
- 적용 시나리오:**
- 특정 애플리케이션의 성능 문제 - 특정 SQL 문장의 성능 저하 - 사용자 불만 기반 문제 해결
- 분석 도구:**
- SQL Trace (10046 Event) - ASH (Active Session History) - SQL Tuning Advisor
2. 성능 문제 분류 체계
2.1 문제 유형별 분류
A. 응답 시간 문제 (Response Time Issues)
```sql -- 응답 시간 분석을 위한 기본 쿼리 SELECT
s.sid, s.serial=, s.username, s.machine, s.program, s s.status, s.last_call_et, w.event, w.state, w.wait_time, w.seconds_in_wait, sq.sql_text
FROM v$session s, v$session_wait w, v$sqlarea sq WHERE s.sid = w.sid WHERE
AND s.sql_address = sq.address(+) AND AND s.sql_hash_value = sq.hash_value(+) AND AND s.status = 'ACTIVE' AND
ORDER BY s.last_call_et DESC; ```
B. 처리량 문제 (Throughput Issues)
```sql -- 시간당 트랜잭션 처리량 분석 SELECT
TO_CHAR(begin_interval_time, 'YYYY-MM-DD HH24') as hour,
(SELECT value FROM dba_hist_sysstat dhss2
WHERE dhss2.snap_id = dhs.snap_id WHERE
AND dhss2.stat_name = 'user commits') / AND
(EXTRACT(HOUR FROM (end_interval_time - begin_interval_time)) * 3600 +
EXTRACT(MINUTE FROM (end_interval_time - begin_interval_time)) * 60) as tps
FROM dba_hist_snapshot dhs WHERE begin_interval_time >= SYSDATE - 7 WHERE ORDER BY begin_interval_time; ```
C. 리소스 고갈 문제 (Resource Exhaustion)
```sql -- 메모리 사용률 분석 SELECT
component, current_size/1024/1024 as current_mb, max_size/1024/1024 as max_mb, (current_size/max_size)*100 as usage_pct
FROM v$memory_dynamic_components WHERE max_size > 0 ORDER BY usage_pct DESC; ```
2.2 문제 심각도 분류
긴급도 매트릭스
| 심각도 | 영향 범위 | 응답 시간 | 비즈니스 영향 | |--------|-----------|-----------|---------------| | P1 | 전체 시스템 | > 300% 증가 | 서비스 중단 | | P2 | 특정 모듈 | 100-300% 증가 | 기능 제한 | | P3 | 개별 기능 | 50-100% 증가 | 사용자 불편 | | P4 | 미미한 영향 | < 50% 증가 | 성능 개선 여지 |
3. Wait Event 기반 근본 원인 분석
3.1 Wait Event 분류 및 진단
A. User I/O 관련 Wait Events
```sql -- I/O 관련 대기 이벤트 상세 분석 SELECT
event, total_waits, total_timeouts, time_waited, average_wait, time_waited/total_waits*1000 as avg_wait_ms
FROM v$system_event WHERE wait_class = 'User I/O' WHERE
AND total_waits > 0
ORDER BY time_waited DESC;
-- 파일별 I/O 통계 SELECT
df.tablespace_name,
df.file_name,
fs.phyblkrd as blocks_read,
fs.phyblkwrt as blocks_written,
fs.readtim as read_time,
fs.writetim as write_time,
CASE WHEN fs.phyblkrd > 0
THEN fs.readtim/fs.phyblkrd*10
ELSE 0 END as avg_read_time_ms
FROM dba_data_files df, v$filestat fs WHERE df.file_id = fs.file= WHERE ORDER BY avg_read_time_ms DESC; ```
- 주요 I/O Wait Events:**
- `db file sequential read`: 인덱스 스캔, ROWID 액세스 - `db file scattered read`: Full Table Scan - `direct path read`: 병렬 처리, LOB 액세스 - `log file sync`: 커밋 대기
B. Concurrency 관련 Wait Events
```sql -- Lock 경합 분석 SELECT
s1.sid as blocker_sid, s1.username as blocker_user, s2.sid as waiter_sid, s2.username as waiter_user, lo.object_id, do.object_name, lo.locked_mode, s2.seconds_in_wait
FROM v$locked_object lo, dba_objects do, v$session s1, v$session s2 WHERE lo.object_id = do.object_id WHERE
AND lo.session_id = s1.sid AND AND s2.blocking_session = s1.sid; AND
```
- 주요 Concurrency Wait Events:**
- `enq: TX - row lock contention`: 행 레벨 lock 경합 - `enq: TM - contention`: 테이블 lock 경합 - `latch: cache buffers chains`: Buffer Cache 경합 - `library cache lock`: 공유 SQL 영역 경합
3.2 Wait Event 패턴 분석
시간대별 Wait Event 트렌드
```sql -- 시간대별 주요 Wait Event 분석 (AWR 기반) SELECT
TO_CHAR(s.begin_interval_time, 'YYYY-MM-DD HH24:MI') as snapshot_time,
e.event_name,
e.total_waits - LAG(e.total_waits) OVER (
PARTITION BY e.event_name ORDER BY s.snap_id
) as waits_delta,
ROUND((e.time_waited_micro - LAG(e.time_waited_micro) OVER (
PARTITION BY e.event_name ORDER BY s.snap_id
))/1000000, 2) as time_waited_sec
FROM dba_hist_system_event e, dba_hist_snapshot s WHERE e.snap_id = s.snap_id WHERE
AND s.begin_interval_time >= SYSDATE - 1 AND
AND e.wait_class != 'Idle' AND
AND e.event_name IN (
'db file sequential read',
'db file scattered read',
'log file sync',
'enq: TX - row lock contention'
)
ORDER BY snapshot_time, time_waited_sec DESC; ```
4. 시스템 리소스 분석
4.1 CPU 사용률 분석
OS Level CPU 분석
```sql -- CPU 사용률 히스토리 (AWR 기반) SELECT
TO_CHAR(s.begin_interval_time, 'YYYY-MM-DD HH24:MI') as time, os1.value as num_cpus, os2.value as load_avg, ROUND(os2.value/os1.value * 100, 2) as cpu_util_pct
FROM dba_hist_osstat os1, dba_hist_osstat os2, dba_hist_snapshot s WHERE os1.snap_id = os2.snap_id WHERE
AND os1.snap_id = s.snap_id AND AND os1.stat_name = 'NUM_CPUS' AND AND os2.stat_name = 'LOAD' AND AND s.begin_interval_time >= SYSDATE - 1 AND
ORDER BY time; ```
Oracle Process CPU 사용률
```sql -- 세션별 CPU 사용률 SELECT
s.sid, s.serial=, s.username, s.machine, s.program, s ss.value/100 as cpu_used_seconds, s.last_call_et as seconds_since_last_call
FROM v$session s, v$sesstat ss, v$statname sn WHERE s.sid = ss.sid WHERE
AND ss.statistic= = sn.statistic= AND AND sn.name = 'CPU used by this session' AND AND ss.value > 0
ORDER BY ss.value DESC; ```
4.2 메모리 사용률 분석
SGA 구성요소별 분석
```sql -- SGA 구성요소별 사용률 SELECT
component, current_size/1024/1024 as current_mb, min_size/1024/1024 as min_mb, max_size/1024/1024 as max_mb, ROUND((current_size/max_size)*100, 2) as usage_pct, resize_ops, last_oper_type
FROM v$memory_dynamic_components WHERE current_size > 0 ORDER BY current_mb DESC;
-- Buffer Cache Hit Ratio 상세 분석 SELECT
name, physical_reads, db_block_gets, consistent_gets, ROUND((1 - (physical_reads / (db_block_gets + consistent_gets))) * 100, 2) as hit_ratio
FROM v$buffer_pool_statistics; ```
PGA 메모리 분석
```sql -- 세션별 PGA 사용량 SELECT
s.sid, s.serial=, s.username, s.program, s ROUND(p.value/1024/1024, 2) as pga_used_mb, ROUND(p2.value/1024/1024, 2) as pga_alloc_mb, ROUND(p3.value/1024/1024, 2) as pga_max_mb
FROM v$session s, v$process pr, v$sesstat p, v$sesstat p2, v$sesstat p3,
v$statname sn, v$statname sn2, v$statname sn3
WHERE s.paddr = pr.addr WHERE
AND s.sid = p.sid AND p.statistic= = sn.statistic= AND AND s.sid = p2.sid AND p2.statistic= = sn2.statistic= AND AND s.sid = p3.sid AND p3.statistic= = sn3.statistic= AND AND sn.name = 'session pga memory' AND AND sn2.name = 'session pga memory max' AND AND sn3.name = 'session uga memory max' AND AND p.value > 1024*1024
ORDER BY pga_used_mb DESC; ```
5. SQL 레벨 근본 원인 분석
5.1 Top SQL 식별 및 분석
리소스 소비량 기준 Top SQL
```sql -- CPU 시간 기준 Top SQL SELECT
sql_id, child_number, executions, ROUND(cpu_time/1000000, 2) as cpu_time_sec, ROUND(elapsed_time/1000000, 2) as elapsed_time_sec, ROUND(cpu_time/executions/1000, 2) as avg_cpu_ms, disk_reads, buffer_gets, rows_processed, SUBSTR(sql_text, 1, 100) as sql_text_preview
FROM v$sql WHERE executions > 0
AND cpu_time > 0
ORDER BY cpu_time DESC FETCH FIRST 20 ROWS ONLY;
-- Logical I/O 기준 Top SQL SELECT
sql_id, executions, buffer_gets, ROUND(buffer_gets/executions, 2) as avg_lio_per_exec, disk_reads, ROUND(disk_reads/executions, 2) as avg_pio_per_exec, rows_processed, SUBSTR(sql_text, 1, 100) as sql_text_preview
FROM v$sql WHERE executions > 0 ORDER BY buffer_gets DESC FETCH FIRST 20 ROWS ONLY; ```
5.2 실행 계획 분석
비효율적인 실행 계획 패턴 식별
```sql -- Full Table Scan이 발생하는 SQL 식별 SELECT DISTINCT
p.sql_id, COUNT(*) as fts_operations, s.executions, s.cpu_time/1000000 as cpu_sec, SUBSTR(s.sql_text, 1, 100) as sql_preview
FROM v$sql_plan p, v$sql s WHERE p.sql_id = s.sql_id WHERE
AND p.child_number = s.child_number AND AND p.operation = 'TABLE ACCESS' AND AND p.options = 'FULL' AND AND s.executions > 10
GROUP BY p.sql_id, s.executions, s.cpu_time, s.sql_text ORDER BY cpu_sec DESC;
-- 카디널리티 추정 오류 식별 SELECT
sql_id, child_number, operation, object_name,
cardinality as estimated_rows,
last_cr_buffer_gets as actual_buffer_gets,
CASE WHEN cardinality > 0
THEN ROUND(last_cr_buffer_gets/cardinality, 2)
ELSE 0 END as actual_vs_estimated_ratio
FROM v$sql_plan_statistics_all WHERE last_cr_buffer_gets > 0
AND cardinality > 0 AND ABS(LOG(10, last_cr_buffer_gets/cardinality)) > 1
ORDER BY actual_vs_estimated_ratio DESC; ```
6. 애플리케이션 레벨 분석
6.1 Connection Pool 분석
```sql -- 세션 상태별 분포 SELECT
status, COUNT(*) as session_count, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) as percentage
FROM v$session WHERE type = 'USER' WHERE GROUP BY status ORDER BY session_count DESC;
-- 장시간 비활성 세션 식별 SELECT
sid, serial=, username, machine, program, sid status, ROUND(last_call_et/3600, 2) as hours_inactive, logon_time
FROM v$session WHERE type = 'USER' WHERE
AND status = 'INACTIVE' AND AND last_call_et > 3600
ORDER BY last_call_et DESC; ```
6.2 트랜잭션 패턴 분석
```sql -- 장기 실행 트랜잭션 식별 SELECT
s.sid, s.serial=, s.username, s.machine, s t.start_time, ROUND((SYSDATE - t.start_date) * 24 * 60, 2) as duration_minutes, t.used_ublk as undo_blocks_used, r.name as rollback_segment
FROM v$session s, v$transaction t, v$rollname r WHERE s.taddr = t.addr WHERE
AND t.xidusn = r.usn AND AND (SYSDATE - t.start_date) * 24 * 60 > 30
ORDER BY duration_minutes DESC; ```
7. 실전 진단 시나리오
시나리오 1: 갑작스러운 전체 시스템 성능 저하
- 진단 순서:**
1. **즉시 확인 사항**
```sql -- 현재 활성 세션 및 대기 이벤트 SELECT event, count(*) as sessions FROM v$session_wait WHERE wait_class != 'Idle' WHERE GROUP BY event ORDER BY sessions DESC; ```
2. **시스템 리소스 확인**
```sql
-- CPU 및 메모리 사용률 확인
SELECT metric_name, value, metric_unit
FROM v$sysmetric
WHERE metric_name IN (
'Host CPU Utilization (%)',
'Current OS Load',
'Memory Usage (%)'
);
```
3. **AWR 스냅샷 생성 및 분석**
```sql -- 수동 AWR 스냅샷 생성 EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT(); ```
시나리오 2: 특정 시간대 성능 저하
- 원인 분석 절차:**
```sql -- 문제 시간대의 Top Wait Events SELECT
event_name, wait_class, total_waits_delta, ROUND(time_waited_delta_micro/1000000, 2) as time_waited_sec
FROM (
SELECT
e.event_name,
e.wait_class,
e.total_waits - LAG(e.total_waits) OVER (
PARTITION BY e.event_name ORDER BY e.snap_id
) as total_waits_delta,
e.time_waited_micro - LAG(e.time_waited_micro) OVER (
PARTITION BY e.event_name ORDER BY e.snap_id
) as time_waited_delta_micro
FROM dba_hist_system_event e, dba_hist_snapshot s
WHERE e.snap_id = s.snap_id WHERE
AND s.begin_interval_time BETWEEN
TO_DATE('2024-01-15 14:00:00', 'YYYY-MM-DD HH24:MI:SS') AND
TO_DATE('2024-01-15 16:00:00', 'YYYY-MM-DD HH24:MI:SS')
AND e.wait_class != 'Idle' AND
) WHERE total_waits_delta > 0 ORDER BY time_waited_sec DESC; ```
8. 자동화된 근본 원인 분석 도구
8.1 ADDM 보고서 자동 생성 스크립트
```sql -- ADDM 보고서 생성 및 분석 DECLARE
task_name VARCHAR2(30) := 'ADDM_ANALYSIS_' || TO_CHAR(SYSDATE, 'YYYYMMDDHH24MI'); task_name begin_snap NUMBER; end_snap NUMBER;
BEGIN
SELECT MAX(snap_id) - 1, MAX(snap_id)
INTO begin_snap, end_snap
FROM dba_hist_snapshot;
DBMS_ADDM.ANALYZE_DB(
task_name => task_name, task_name
begin_snapshot => begin_snap, begin_snapshot
end_snapshot => end_snap end_snapshot
);
-- 권고사항 조회
FOR rec IN (
SELECT finding_name, impact, message
FROM dba_addm_findings
WHERE task_name = task_name WHERE
ORDER BY impact DESC
) LOOP
DBMS_OUTPUT.PUT_LINE('Finding: ' || rec.finding_name);
DBMS_OUTPUT.PUT_LINE('Impact: ' || rec.impact);
DBMS_OUTPUT.PUT_LINE('Message: ' || rec.message);
DBMS_OUTPUT.PUT_LINE('---');
END LOOP;
END; / ```
8.2 Python 기반 성능 모니터링 자동화
```python
Oracle 성능 모니터링 자동화 스크립트 예시
import cx_Oracle import pandas as pd import numpy as np from datetime import datetime, timedelta
class OraclePerformanceMonitor:
def __init__(self, connection_string):
self.connection = cx_Oracle.connect(connection_string) self
def get_top_wait_events(self, hours_back=1): def
query = """ query
SELECT event, total_waits, time_waited, average_wait
FROM v$system_event
WHERE wait_class != 'Idle' WHERE
AND total_waits > 0
ORDER BY time_waited DESC
FETCH FIRST 10 ROWS ONLY
"""
return pd.read_sql(query, self.connection)
def analyze_performance_degradation(self):
= CPU 사용률 체크 =
cpu_query = """ cpu_query
SELECT value FROM v$sysmetric
WHERE metric_name = 'Host CPU Utilization (%)' WHERE
AND intsize_csec = 6000 AND
"""
cpu_usage = pd.read_sql(cpu_query, self.connection) cpu_usage
= 임계치 초과 시 알림 =
if cpu_usage['VALUE'].iloc[0] > 80:
return "HIGH_CPU_ALERT"
return "NORMAL"
```
이러한 체계적인 접근법을 통해 Oracle Database의 성능 문제를 효과적으로 진단하고 근본 원인을 파악할 수 있습니다. 각 단계별로 구체적인 SQL과 분석 방법을 제시했으니, 실제 운영 환경에서 활용하시기 바랍니다.