Linux Stories.

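Linux Routine Check Script

2013. 5. 30. 03:00

A routine check script for Linux. Looking at how other people have put theirs together and borrowing from more sources should make it better over time.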

The server is not responding: "INFO: task <process>:<pid> blocked for more than 120 seconds"

2013. 4. 4. 11:17

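When you run Linux systems you run into system hang-ups fairly often. These are usually not a defect in Linux itself; they show up when a particular process cannot be interrupted or scheduled normally. This post briefly introduces that kind of problem.

The following messages suddenly started appearing in the system log, /var/log/messages: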

kernel: INFO: task startup.sh:9902 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: startup.sh D 0000000000000008 0 9902 7811 0x00000004
kernel: ffff880828f17ce8 0000000000000082 ffff88081c2aeaa0 ffff88081c2aeaa0
kernel: ffff88081c2aeaa0 ffffea00378c0d98 ffff88081c2aeaa0 0000010100001e60
kernel: ffff88081c2af058 ffff880828f17fd8 000000000000fb88 ffff88081c2af058
kernel: Call Trace:
kernel: [<ffffffffa00accf0>] ? ext4_file_open+0x0/0x130 [ext4]
kernel: [<ffffffff814eae85>] schedule_timeout+0x215/0x2e0
kernel: [<ffffffff81174234>] ? nameidata_to_filp+0x54/0x70
kernel: [<ffffffff81268d39>] ? cpumask_next_and+0x29/0x50
kernel: [<ffffffff814eab03>] wait_for_common+0x123/0x180
kernel: [<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
kernel: [<ffffffff814eac1d>] wait_for_completion+0x1d/0x20
kernel: [<ffffffff8106155c>] sched_exec+0xdc/0xe0
kernel: [<ffffffff8117ee90>] do_execve+0xe0/0x2c0
kernel: [<ffffffff810095ea>] sys_execve+0x4a/0x80
kernel: [<ffffffff8100b4ca>] stub_execve+0x6a/0xc0

[Likely cause 1]
The kernel: INFO: task message in the system log means that the khungtaskd kernel thread found the startup.sh script stuck in D state for 120 seconds (the 2-minute default) and printed a call trace for it.

[Likely cause 2]
System performance degradation. According to Red Hat's reports in particular, this is often a symptom of heavy disk I/O.

[Likely cause 3]
The application stayed in D state (uninterruptible sleep) for 120 seconds. This can also happen when the process is simply not being scheduled normally.
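
To see which processes are in D state right now, the output of ps can be filtered on the state column. This is only a minimal sketch: the function name dstate_only is made up for this example, and it assumes input lines of the form "state pid command".

```shell
# Keep only the lines whose first column (process state) starts with D.
# Expects "state pid command" lines on stdin.
dstate_only() {
    awk '$1 ~ /^D/ { print }'
}

# Usage on a live system:
#   ps -e -o state= -o pid= -o comm= | dstate_only
```

A process that shows up here on every run over a couple of minutes is a likely candidate for the hung task warning.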

[ How to analyze the cause of the D state ]
The most accurate way to pin down the problem is to take a kernel dump and analyze the resulting core. How to analyze the core will be covered in a later post.

# echo 1 > /proc/sys/kernel/hung_task_panic

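With this setting the kernel panics when a hung task is detected, so if kdump is configured a core is written the next time the same symptom occurs, and the D-state problem of the affected process can then be analyzed.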

[ Disable hung_task_timeout ]
The basic workaround that is usually recommended is to disable the hung task timeout as shown below.

# echo 0 > /proc/sys/kernel/hung_task_timeout_secs

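After disabling the kernel's hang timeout check, keep monitoring the system to see whether the problem persists. Note that this setting only turns off the detection and its message; it does not remove the stall itself. If the process keeps falling into uninterruptible sleep, a kernel update should also be considered.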
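
For the monitoring part, something like the following can pull the task name, PID, and timeout out of the log. It is a rough sketch that assumes the message format shown in the log excerpt earlier; hung_tasks is a name made up here, and the log path differs between distributions.

```shell
# Print "task pid seconds" for every hung-task warning found on stdin.
# Assumes lines like:
#   kernel: INFO: task startup.sh:9902 blocked for more than 120 seconds.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}

# Usage (count warnings per task):
#   hung_tasks < /var/log/messages | sort | uniq -c
```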

Using Ubuntu's apt-get (Advanced Packaging Tool)

2013. 4. 3. 16:43

On RPM-based distributions such as Red Hat and CentOS, YUM is the powerful package management tool that resolves package dependencies so that packages can be installed safely. Debian-based distributions have a very capable counterpart called apt-get (Advanced Packaging Tool). The notes below summarize it by comparison with YUM on Red Hat systems.

1. Index repository for deb package information

apt-get fetches its package lists based on package index information, which can reasonably be thought of as the equivalent of a yum repository.

Path: /etc/apt/sources.list

#deb cdrom:[Ubuntu 12.10 _Quantal Quetzal_ - Release amd64 (20121017.5)]/ quantal main restricted

# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
# newer versions of the distribution.
deb http://kr.archive.ubuntu.com/ubuntu/ quantal main restricted
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal main restricted

## Major bug fix updates produced after the final release of the
## distribution.
deb http://kr.archive.ubuntu.com/ubuntu/ quantal-updates main restricted
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal-updates main restricted

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team. Also, please note that software in universe WILL NOT receive any
## review or updates from the Ubuntu security team.
deb http://kr.archive.ubuntu.com/ubuntu/ quantal universe
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal universe
deb http://kr.archive.ubuntu.com/ubuntu/ quantal-updates universe
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal-updates universe

## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
deb http://kr.archive.ubuntu.com/ubuntu/ quantal multiverse
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal multiverse
deb http://kr.archive.ubuntu.com/ubuntu/ quantal-updates multiverse
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal-updates multiverse

## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
deb http://kr.archive.ubuntu.com/ubuntu/ quantal-backports main restricted universe multiverse
deb-src http://kr.archive.ubuntu.com/ubuntu/ quantal-backports main restricted universe multiverse

deb http://security.ubuntu.com/ubuntu quantal-security main restricted
deb-src http://security.ubuntu.com/ubuntu quantal-security main restricted
deb http://security.ubuntu.com/ubuntu quantal-security universe
deb-src http://security.ubuntu.com/ubuntu quantal-security universe
deb http://security.ubuntu.com/ubuntu quantal-security multiverse
deb-src http://security.ubuntu.com/ubuntu quantal-security multiverse

## Uncomment the following two lines to add software from Canonical's
## 'partner' repository.
## This software is not part of Ubuntu, but is offered by Canonical and the
## respective vendors as a service to Ubuntu users.
# deb http://archive.canonical.com/ubuntu quantal partner
# deb-src http://archive.canonical.com/ubuntu quantal partner

## This software is not part of Ubuntu, but is offered by third-party
## developers who want to ship their latest software.
deb http://extras.ubuntu.com/ubuntu quantal main
deb-src http://extras.ubuntu.com/ubuntu quantal main
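
To get a quick overview of which suites and components are actually enabled in a file like the one above, the deb lines can be summarized. This is a small sketch: enabled_repos is a made-up name, and it assumes plain "deb URL suite components..." lines without bracketed options such as [arch=amd64].

```shell
# Print "suite component" pairs for every enabled binary ("deb") entry.
# Reads a sources.list-style file on stdin; commented-out lines and
# deb-src lines are skipped because their first field is not exactly "deb".
enabled_repos() {
    awk '$1 == "deb" { for (i = 4; i <= NF; i++) print $3, $i }'
}

# Usage:
#   enabled_repos < /etc/apt/sources.list | sort -u
```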

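2. Installing packages with apt-get

Now let's install packages with Ubuntu's apt-get. The commands below cover the usual steps; use whichever ones you need.

2.1 Installing and updating packages

sudo apt-get install [package name] : find the named package and install it
sudo apt-get upgrade : upgrade the installed packages to the newest available versions
sudo apt-get update : refresh the package index from the configured repositories (run this before upgrade)
sudo apt-get dist-upgrade : upgrade while handling changed dependencies, installing or removing packages where needed
sudo apt-get install --reinstall [package name] : reinstall a package that is already installed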

2.2 Searching for and removing packages

sudo apt-cache search [Package Name] : search for a package
sudo apt-get remove [Package Name] : remove a package
sudo apt-get source [Package Name] : download the source code of a package
sudo apt-get build-dep [Package Name] : install the packages needed to build the given source package
sudo apt-cache show [Package Name] : show detailed information about a package

The .deb files that apt-get downloads are cached automatically in the path below, which is handy when you later need to find a package file that was installed.

deb package path : /var/cache/apt/archives

Ubuntu Static Network Configuration and Online Package Search

2013. 4. 1. 17:18

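Hmm.. having used only Red Hat family Linux for a long time, I'm not yet used to the various scripts and configuration files, but how different can it be? It will become familiar with time. What really stands out is how well documented everything is, which I like, and as I write up the important pieces one by one it should become as familiar as Red Hat.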

1. Defining the eth0 device

On Ubuntu the network interface configuration lives in /etc/network/interfaces, so that is the file to edit.

== eth0 Static IP Address Configuration ==

auto eth0

iface eth0 inet static

address 192.168.1.5

netmask 255.255.255.0

gateway 192.168.1.254

dns-nameservers XXX.XXX.XXX.XXX

== eth0 DHCP IP Address Configuration ==

auto eth0

iface eth0 inet dhcp

After making these settings, restart networking by bringing it down and back up.

# /etc/init.d/networking restart
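
A common mistake in a static configuration is a gateway that is not inside the network defined by address and netmask. Before restarting, that can be checked with a little arithmetic. The sketch below uses the values from the static example; ip_to_int and same_subnet are names invented for this illustration.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when both addresses fall in the same network under the netmask.
# $1: address  $2: gateway  $3: netmask
same_subnet() {
    a=$(ip_to_int "$1"); b=$(ip_to_int "$2"); m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Usage with the values from the static example:
#   same_subnet 192.168.1.5 192.168.1.254 255.255.255.0 && echo "gateway reachable"
```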

2. How to Set up Interfaces

It is also worth looking at the reference file that describes how Ubuntu network interfaces are defined. It is in the /usr/share/doc/ifupdown/examples directory; decompressing network-interfaces.gz gives the example network interfaces configuration below. Take a look.

######################################################################

# /etc/network/interfaces -- configuration file for ifup(8), ifdown(8)

# A "#" character in the very first column makes the rest of the line

# be ignored. Blank lines are ignored. Lines may be indented freely.

# A "\" character at the very end of the line indicates the next line

# should be treated as a continuation of the current one.

# The "pre-up", "up", "down" and "post-down" options are valid for all

# interfaces, and may be specified multiple times. All other options

# may only be specified once.

# See the interfaces(5) manpage for information on what options are

# available.

######################################################################

# We always want the loopback interface.

# auto lo

# iface lo inet loopback

# An example ethernet card setup: (broadcast and gateway are optional)

# auto eth0

# iface eth0 inet static

# address 192.168.0.42

# network 192.168.0.0

# netmask 255.255.255.0

# broadcast 192.168.0.255

# gateway 192.168.0.1

# A more complicated ethernet setup, with a less common netmask, and a downright

# weird broadcast address: (the "up" lines are executed verbatim when the

# interface is brought up, the "down" lines when it's brought down)

# auto eth0

# iface eth0 inet static

# address 192.168.1.42

# network 192.168.1.0

# netmask 255.255.255.128

# broadcast 192.168.1.0

# up route add -net 192.168.1.128 netmask 255.255.255.128 gw 192.168.1.2

# up route add default gw 192.168.1.200

# down route del default gw 192.168.1.200

# down route del -net 192.168.1.128 netmask 255.255.255.128 gw 192.168.1.2

# A more complicated ethernet setup with a single ethernet card with

# two interfaces.

# Note: This happens to work since ifconfig handles it that way, not because

# ifup/down handles the ':' any differently.

# Warning: There is a known bug if you do this, since the state will not

# be properly defined if you try to 'ifdown eth0' when both interfaces

# are up. The ifconfig program will not remove eth0 but it will be

# removed from the interfaces state so you will see it up until you execute:

# 'ifdown eth0:1 ; ifup eth0; ifdown eth0'

# BTW, this is "bug" #193679 (it's not really a bug, it's more of a

# limitation)

# auto eth0 eth0:1

# iface eth0 inet static

# address 192.168.0.100

# network 192.168.0.0

# netmask 255.255.255.0

# broadcast 192.168.0.255

# gateway 192.168.0.1

# iface eth0:1 inet static

# address 192.168.0.200

# network 192.168.0.0

# netmask 255.255.255.0

# "pre-up" and "post-down" commands are also available. In addition, the

# exit status of these commands are checked, and if any fail, configuration

# (or deconfiguration) is aborted. So:

# auto eth0

# iface eth0 inet dhcp

# pre-up [ -f /etc/network/local-network-ok ]

# will allow you to only have eth0 brought up when the file

# /etc/network/local-network-ok exists.

# Two ethernet interfaces, one connected to a trusted LAN, the other to

# the untrusted Internet. If their MAC addresses get swapped (because an

# updated kernel uses a different order when probing for network cards,

# say), then they don't get brought up at all.

# auto eth0 eth1

# iface eth0 inet static

# address 192.168.42.1

# netmask 255.255.255.0

# pre-up /path/to/check-mac-address.sh eth0 11:22:33:44:55:66

# pre-up /usr/local/sbin/enable-masq

# iface eth1 inet dhcp

# pre-up /path/to/check-mac-address.sh eth1 AA:BB:CC:DD:EE:FF

# pre-up /usr/local/sbin/firewall

# Two ethernet interfaces, one connected to a trusted LAN, the other to

# the untrusted Internet, identified by MAC address rather than interface

# name:

# auto eth0 eth1

# mapping eth0 eth1

# script /path/to/get-mac-address.sh

# map 11:22:33:44:55:66 lan

# map AA:BB:CC:DD:EE:FF internet

# iface lan inet static

# address 192.168.42.1

# netmask 255.255.255.0

# pre-up /usr/local/sbin/enable-masq $IFACE

# iface internet inet dhcp

# pre-up /usr/local/sbin/firewall $IFACE

# A PCMCIA interface for a laptop that is used in different locations:

# (note the lack of an "auto" line for any of these)

# mapping eth0

# script /path/to/pcmcia-compat.sh

# map home,*,*,* home

# map work,*,*,00:11:22:33:44:55 work-wireless

# map work,*,*,01:12:23:34:45:50 work-static

# iface home inet dhcp

# iface work-wireless bootp

# iface work-static static

# address 10.15.43.23

# netmask 255.255.255.0

# gateway 10.15.43.1

# Note, this won't work unless you specifically change the file

# /etc/pcmcia/network to look more like:

# if [ -r ./shared ] ; then . ./shared ; else . /etc/pcmcia/shared ; fi

# get_info $DEVICE

# case "$ACTION" in

# 'start')

# /sbin/ifup $DEVICE

# ;;

# 'stop')

# /sbin/ifdown $DEVICE

# ;;

# esac

# exit 0

# An alternate way of doing the same thing: (in this case identifying

# where the laptop is is done by configuring the interface as various

# options, and seeing if a computer that is known to be on each particular

# network will respond to pings. The various numbers here need to be chosen

# with a great deal of care.)

# mapping eth0

# script /path/to/ping-places.sh

# map 192.168.42.254/24 192.168.42.1 home

# map 10.15.43.254/24 10.15.43.1 work-wireless

# map 10.15.43.23/24 10.15.43.1 work-static

# iface home inet dhcp

# iface work-wireless bootp

# iface work-static static

# address 10.15.43.23

# netmask 255.255.255.0

# gateway 10.15.43.1

# Note that the ping-places script requires the iproute package installed,

# and the same changes to /etc/pcmcia/network are required for this as for

# the previous example.

# Set up an interface to read all the traffic on the network. This

# configuration can be useful to setup Network Intrusion Detection

# sensors in 'stealth'-type configuration. This prevents the NIDS

# system to be a direct target in a hostile network since they have

# no IP address on the network. Notice, however, that there have been

# known bugs over time in sensors part of NIDS (for example see

# DSA-297 related to Snort) and remote buffer overflows might even be

# triggered by network packet processing.

# auto eth0

# iface eth0 inet manual

# up ifconfig $IFACE 0.0.0.0 up

# up ip link set $IFACE promisc on

# down ip link set $IFACE promisc off

# down ifconfig $IFACE down

# Set up an interface which will not be allocated an IP address by

# ifupdown but will be configured through external programs. This

# can be useful to setup interfaces configured through other programs,

# like, for example, PPPOE scripts.

# auto eth0

# iface eth0 inet manual

# up ifconfig $IFACE 0.0.0.0 up

# up /usr/local/bin/myconfigscript

# down ifconfig $IFACE down

3. Online package search

On Red Hat family Linux I used yum search a lot. On Ubuntu, which is Debian-based, the following command searches for deb packages just as nicely. For anyone operating production Ubuntu systems this should be very useful.

# apt-cache search [Package Name]

Linux Kernel Parameters for Tuning

2013. 3. 29. 09:32

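Performance tuning requests from the financial sector seem to be coming up a lot lately, so here is a summary.

The right values depend on the characteristics and environment of the application being run, but for systems that need a low-latency environment, as is common in finance, or that need careful adjustment of shared memory or semaphore values, settings like the following are sometimes used.

[After modification]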
title Red Hat Enterprise Linux Server LL (2.6.32-220.4.2.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-220.4.2.el6.x86_64 ro root=UUID=dd05c407-d62c-40b1-800e-ddff20d33062 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD quiet SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM nohpet intel_idle.max_cstate=0 processor.max_cstate=1 cgroup_disable=memory mce=ignore_ce transparent_hugepage=never nmi_watchdog=1 elevator=deadline idle=mwait nohz=off
        initrd /initramfs-2.6.32-220.4.2.el6.x86_64.img

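[Changes]
Append the following to the existing kernel line:
intel_idle.max_cstate=0 processor.max_cstate=1 cgroup_disable=memory mce=ignore_ce transparent_hugepage=never nmi_watchdog=1 elevator=deadline idle=mwait nohz=off

[Description of each item]
* intel_idle.max_cstate=0 : disables the intel_idle driver, so it no longer manages processor C-states
* processor.max_cstate=1 : keeps the processor from entering power-saving states deeper than C1
* cgroup_disable=memory : disables the memory controller of cgroups, the feature that provides resource QoS
* mce=ignore_ce : ignores corrected hardware errors, avoiding the latency spikes caused by scanning for them
* transparent_hugepage=never : disables transparent huge pages
* nmi_watchdog=1 : enables the NMI watchdog, so that a hang can trigger a panic and be captured by kdump
* elevator=deadline : changes the I/O scheduler from the default cfq to deadline
* idle=mwait : has idle CPUs wait using the MWAIT instruction
* nohz=off : turns off tickless (dynamic tick) mode; the intent here is to keep CPUs from entering C-states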
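
After rebooting it is worth confirming that the parameters really made it onto the kernel command line. A minimal sketch, assuming exact token matches; missing_params is a name invented for this example.

```shell
# Report which of the expected parameters are missing from a kernel command line.
# $1: command line text; remaining arguments: parameters to look for (exact tokens).
missing_params() {
    cmdline=$1; shift
    for p in "$@"; do
        case " $cmdline " in
            *" $p "*) ;;          # present, nothing to report
            *) echo "$p" ;;       # not found as a whole token
        esac
    done
}

# Usage on a running system:
#   missing_params "$(cat /proc/cmdline)" nohz=off elevator=deadline idle=mwait
```

No output means every listed parameter is present.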

2. Additions and changes to the existing sysctl.conf

#### Kdump settings
net.ipv4.conf.lo.force_igmp_version = 2
net.ipv4.conf.eth0.force_igmp_version = 2
net.ipv4.conf.eth1.force_igmp_version = 2
net.ipv4.conf.eth4.force_igmp_version = 2
kernel.unknown_nmi_panic = 1
kernel.panic_on_unrecovered_nmi = 1
kernel.panic_on_io_nmi = 1

#### Turn off reverse-path filtering (rp_filter) so multicast traffic on each network is not filtered
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.eth0.rp_filter = 0
net.ipv4.conf.eth1.rp_filter = 0
net.ipv4.conf.eth4.rp_filter = 0

#### 10G NIC tuning (values suited to low latency, applied after consulting SolarFlare)
net.ipv4.neigh.default.unres_qlen = 100
net.ipv4.neigh.lo.unres_qlen = 100
net.ipv4.neigh.eth0.unres_qlen = 100
net.ipv4.neigh.eth1.unres_qlen = 100
net.ipv4.neigh.eth4.unres_qlen = 100
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_rmem = 4096 8388608 16777216
net.ipv4.tcp_wmem = 4096 8388608 16777216
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_default = 8388608
net.core.wmem_max = 16777216

### Restrict ARP replies (arp_ignore = 1: reply only when the target IP is configured on the receiving interface), intended to reduce network latency
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.eth1.arp_ignore = 1
net.ipv4.conf.eth4.arp_ignore = 1

### Shared memory and semaphore values
kernel.shmmni = 4096
kernel.sem = 2000 32000 512 5029

### Swap usage ratio
vm.swappiness = 10

### Scheduler nr_migrate value
kernel.sched_nr_migrate = 12

net.core.netdev_max_backlog = 20000
net.core.optmem_max = 25165824
#net.core.optmem_max = 20480
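
With this many values, it helps to check which ones differ from what the running kernel currently has. The sketch below compares a sysctl.conf-style file with the files under /proc/sys. The name sysctl_diff is made up, the mapping simply turns dots in the key into slashes (so interface names that themselves contain dots are not handled), and the optional argument exists only so the function can be tried against a test directory.

```shell
# Compare "key = value" lines from a sysctl.conf-style file (stdin) with the
# live values under a /proc/sys-style tree (default /proc/sys, $1 overrides).
# Prints one line per key whose current value differs from the wanted one.
sysctl_diff() {
    root=${1:-/proc/sys}
    while IFS= read -r line; do
        case $line in
            '#'*) continue ;;     # comment line
            *=*) ;;               # a setting, handled below
            *) continue ;;        # blank line or anything else
        esac
        key=$(printf '%s' "${line%%=*}" | tr -d ' \t')
        # Normalize runs of tabs/spaces to single spaces and trim both ends.
        want=$(printf '%s\n' "${line#*=}" | tr -s '\t ' '  ' | sed 's/^ //; s/ $//')
        file=$root/$(printf '%s' "$key" | tr . /)
        if [ -r "$file" ]; then
            have=$(tr -s '\t ' '  ' < "$file" | sed 's/^ //; s/ $//')
        else
            have="(not present)"
        fi
        [ "$want" = "$have" ] || echo "$key: want '$want', have '$have'"
    done
}

# Usage:
#   sysctl_diff < /etc/sysctl.conf
```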

3. Applying the low-latency profile with tuned
tuned provides profiles pre-tuned for how the OS is used, and it can apply the system-wide tuning recommended by Red Hat. Apply the profile with tuned as follows.

# tuned-adm profile latency-performance

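4. File system tuning
[Changes]
In /etc/fstab, add the noatime,nodiratime,nobarrier options after defaults in each mount entry.

[Description of each option]
* noatime : do not record file access times in the file system metadata
* nodiratime : do not record directory access times in the file system metadata
* nobarrier : disable write barriers (the cache flushes that enforce write ordering); this trades crash safety for speed unless the storage has a battery-backed write cache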
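
To find the mount entries that still lack the option, fstab can be scanned as below. This is only a sketch: mounts_without_noatime is an invented name, and the list of file system types to skip is an assumption that may need adjusting.

```shell
# Print mount points from an fstab-style file (stdin) whose option field
# does not contain noatime. Comments, swap, and a few pseudo file systems
# are skipped.
mounts_without_noatime() {
    awk '$1 !~ /^#/ && NF >= 4 && $3 !~ /^(swap|proc|sysfs|tmpfs|devpts)$/ {
        n = split($4, opts, ",")
        found = 0
        for (i = 1; i <= n; i++) if (opts[i] == "noatime") found = 1
        if (!found) print $2
    }'
}

# Usage:
#   mounts_without_noatime < /etc/fstab
```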

5. Open-file limit for the service account
[Changes]
In /etc/security/limits.conf add:
username - nofile 4096
# Raises the open-file limit for the user that runs the service to 4096

Linux 3.7 Kernel Release: TCP Fast Open and vxlan Support

2012. 12. 28. 00:59

The Linux 3.7 kernel was released on December 10. This release brings many updates, including support for the 64-bit ARM architecture, signed kernel modules, Btrfs file system updates, a new strace-like "perf trace" tool, server-side TCP Fast Open, and a stable NFS 4.1.

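The most visible changes are ARM multi-platform and 64-bit support. A single ARM kernel image can now boot on multiple hardware platforms, which makes it easier to ship distributions that support ARM.

On the security side, signed kernel modules are supported, so that modules that are not correctly signed can be refused by the kernel. This makes it harder to load a module through a rootkit, which is quite useful.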

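From a networking point of view, TCP Fast Open is now supported on the server side. Client-side support was already added in the 3.6 kernel; this release adds the server side. Fast Open optimizes TCP connection setup and speeds up page loads by roughly 4% to 41%, depending of course on the environment. I plan to cover this feature again in a later post.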
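
Whether the client and server sides are switched on can be read from the net.ipv4.tcp_fastopen sysctl, which is a bitmask: bit 0x1 enables the client side and bit 0x2 the server side. A small sketch for decoding it (tfo_modes is a name made up here):

```shell
# Decode the client/server bits of a net.ipv4.tcp_fastopen value.
tfo_modes() {
    v=$1
    [ $(( v & 1 )) -ne 0 ] && echo client
    [ $(( v & 2 )) -ne 0 ] && echo server
    return 0
}

# Usage:
#   tfo_modes "$(cat /proc/sys/net/ipv4/tcp_fastopen)"
```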

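SMBv2 is also supported (experimental), and NFS 4.1 has finally been released as stable after a long period. The main feature of NFS 4.1 is pNFS, parallel NFS, which allows parallel access to a file system distributed across multiple servers. The kernel also supports vxlan, a tunneling protocol that carries layer-2 Ethernet packets over UDP. vxlan is mainly used to provide tunneled networks in virtualized environments. VLANs are limited to 4096 IDs, whereas vxlan widens the identifier to 24 bits and supports far more. This too will be covered in a later post.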
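
The difference between the two identifier sizes is easy to work out: the VLAN ID field is 12 bits and the vxlan network identifier is 24 bits. A tiny sketch (id_space is an invented helper name):

```shell
# Number of distinct IDs an n-bit identifier field can carry.
id_space() {
    echo $(( 1 << $1 ))
}

# id_space 12   # VLAN:  4096
# id_space 24   # vxlan: 16777216
```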

Finally, the main networking changes are as follows:

loopback: set default MTU to 64K
Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries
Use a per-task frag allocator
Netfilter
- Add protocol-independent NAT core
- Add IPv6 MASQUERADE target
- Add IPv6 NETMAP target
- Add IPv6 REDIRECT target
- Add IPv6 NAT support
- Support IPv6 in FTP NAT helper
- Support IPv6 in IRC NAT helper
- Support IPv6 in SIP NAT helper
- Support IPv6 in amanda NAT helper
- Add stateless IPv6-to-IPv6 Network Prefix Translation target
- Remove xt_NOTRACK
Near Field Communication (NFC): add a Link Layer Control (LLC) core layer to HCI, add an shdlc llc module to the llc core, LLCP raw socket support
bonding: support for IPv6 transmit hashing (and TCP or UDP over IPv6), bringing IPv6 up to par with IPv4 support in the bonding driver
team: add support for non-Ethernet devices
gre: support GRE over IPv6, add GSO support, add GRO capability
packet: diag core and basic socket info dumping
ethtool: support for setting MDI/MDI-X state for twisted pair wiring
ppp: add 64-bit stats
Add generic netlink support for tcp_metrics

[References]

1. Linux kernel 3.7 release

http://kernelnewbies.org/Linux_3.7

=================================

With the 3.7 release the networking features of the Linux kernel have become considerably more powerful. Nice. ^^

How to Detect SCSI Devices Without Rebooting

2012. 12. 14. 14:22

To get Linux to recognize a SCSI disk without rebooting, follow the procedure below. There is no need to go as far as rebooting a production system for this.

It is possible to add or remove a SCSI device explicitly, or to re-scan an entire SCSI bus without rebooting a running system. Please see the Online Storage Reconfiguration Guide for a complete overview of this topic on Red Hat Enterprise Linux 5.

For Red Hat Enterprise Linux 5

With fibre attached storage, it is possible to issue a LIP (loop initialization primitive) on the fabric:

echo "1" > /sys/class/fc_host/host#/issue_lip

Issuing a LIP (above) on Red Hat Enterprise Linux 5 is all that is needed to rescan fibre attached storage. Once the LIP is issued, the bus scan may take a few seconds to complete.

Although present on previous releases, this feature is only fully supported in Red Hat Enterprise Linux 5.

For Red Hat Enterprise Linux 4 and 5

With other SCSI attached storage, a rescan can be issued:

echo "- - -" > /sys/class/scsi_host/host#/scan

Replace the # with the number of the SCSI bus to be rescanned.

In addition to re-scanning the entire bus, a specific device can be added or deleted for some versions of Red Hat Enterprise Linux as specified below.

For Red Hat Enterprise Linux 4 or 5

To remove a single existing device explicitly

# echo 1 > /sys/block/<device>/device/delete

For Red Hat Enterprise Linux 3, 4, or 5

To add a single device explicitly:

# echo "scsi add-single-device <H> <B> <T> <L>" > /proc/scsi/scsi

To remove a device explicitly:

# echo "scsi remove-single-device <H> <B> <T> <L>" > /proc/scsi/scsi

Where <H> <B> <T> <L> are the host, bus, target, and LUN numbers for the device, as reported in /sys (2.6 kernels only) or /proc/scsi/scsi or dmesg.

These numbers are sometimes referred to as "Host", "Channel", "Id", and "Lun" in Linux tool output and documentation.
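
When it is not obvious which host number the new disk hangs off, the "- - -" rescan can simply be sent to every SCSI host. A minimal sketch, not taken from the Red Hat article: rescan_all_hosts is a made-up name, it needs root on a real system, and the optional directory argument is only there so it can be tried against a test directory first.

```shell
# Write the "- - -" wildcard (channel, target, LUN) to every host's scan file.
# $1: scsi_host class directory (default /sys/class/scsi_host).
rescan_all_hosts() {
    base=${1:-/sys/class/scsi_host}
    for scan in "$base"/host*/scan; do
        [ -e "$scan" ] || continue    # no hosts matched the pattern
        echo "- - -" > "$scan" && echo "rescanned ${scan%/scan}"
    done
}

# Usage (as root):
#   rescan_all_hosts
```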

Gathering SCSI Information

2012. 12. 11. 11:12

When a Linux system uses storage volumes over FC, you need to gather the overall SCSI information. Let's go through how to collect the WWPN, the HBA card details, and the SCSI connection information.

1. Looking up the WWPN through sysfs

Reading /sys/class/scsi_host/host1/state as shown below reports the FC link status for each host.

# cat /sys/class/scsi_host/host1/state

Link Up - Ready:

Fabric

For qlogic devices (qla2xxx driver) the output would instead be as follows:

2. Looking up the WWPN with systool

systool is provided by the sysfsutils package. On Red Hat Enterprise Linux family systems, install it as follows before use.

[root@/]yum -y install sysfsutils --> the sysfsutils package contains the systool command.

To examine some simple information about the Fibre Channel HBAs in a machine:

# systool -c fc_host -v

To look at verbose information regarding the SCSI adapters present on a system:

# systool -c scsi_host -v

To see what Fibre Channel devices are connected to the Fibre Channel HBA cards:

# systool -c fc_remote_ports -v -d

For Fibre Channel transport information:

# systool -c fc_transport -v

For information on SCSI disks connected to a system:

# systool -c scsi_disk -v

To examine more disk information including which hosts are connected to which disks:

# systool -b scsi -v

Furthermore, by installing the sg3_utils package it is possible to use the sg_map command to view more information about the SCSI map. After installing the package, run:

# modprobe sg

# sg_map -x

Finally, to obtain driver information, including version numbers and active parameters, the following commands can be used for the lpfc and qla2xxx drivers respectively:

# systool -m lpfc -v

# systool -m qla2xxx -v
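
If systool is not installed, the WWPN can also be read straight from sysfs through the port_name and port_state attributes under /sys/class/fc_host. A short sketch under that assumption; list_wwpns is an invented name and is not part of the Red Hat article.

```shell
# Print "hostN port_name port_state" for each Fibre Channel host.
# $1: fc_host class directory (default /sys/class/fc_host).
list_wwpns() {
    base=${1:-/sys/class/fc_host}
    for h in "$base"/host*; do
        [ -r "$h/port_name" ] || continue   # not an FC host, or nothing matched
        state=unknown
        [ -r "$h/port_state" ] && state=$(cat "$h/port_state")
        echo "${h##*/} $(cat "$h/port_name") $state"
    done
}

# Usage:
#   list_wwpns
```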

Original Source : https://access.redhat.com/knowledge/solutions/9936

Preventing Linux System Hang-ups

2012. 11. 26. 11:38

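Enterprise systems generally provide an NMI switch for dealing with system hang-ups, but on x86_64 systems it is usually the operating system that handles the NMI signal. On Linux in particular, applying the NMI option in the kernel parameters can help keep a system hang-up from going unhandled.

- Check Point 1 -

On Linux distributions meant for enterprise environments, such as Red Hat or SUSE Linux, the NMI option is enabled by default. People who don't know this are sometimes caught off guard when the system reboots after receiving an NMI signal from the hardware. Check the NMI option in the virtual file system below.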

cat /proc/sys/kernel/nmi_watchdog

If querying /proc/sys/kernel/nmi_watchdog returns 1, as above, the NMI watchdog is active on Linux regardless of the kernel boot options.

- Check Point 2 -

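NMI signalling can also be configured in the hardware. It differs a little between vendors, but typically there is a BIOS-level setting that controls whether NMI signals are sent.

NMI Control Disable / Enable

The settings and options for NMI handling and the related interrupts on Linux are described below.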

==============================

Using the NMI Watchdog to detect hangs

When the NMI watchdog is enabled, the system hardware is programmed to periodically generate an NMI. Each NMI invokes a handler in the Linux kernel to check the count of certain interrupts. If the handler detects that this count has not increased over a certain period of time, it assumes the system is hung. It then invokes the panic routine. If Kdump is enabled, the routine also saves a crash dump.

Determining whether the NMI Watchdog is enabled

To determine whether NMI watchdog is enabled, enter the following command. The included output indicates that there is no NMI count in all processors, thus NMI watchdog is disabled on this system:

# grep NMI /proc/interrupts 
NMI:    0    0    0    0

Enabling the NMI watchdog

To enable the NMI watchdog, add nmi_watchdog=1 or nmi_watchdog=2 to your boot entry.

Note: Not all hardware supports the nmi_watchdog=1 boot parameter. Some hardware supports the nmi_watchdog=2 parameter, and some hardware supports neither parameter.

Edit the /boot/grub/menu.lst file to add the nmi_watchdog=1 parameter or the nmi_watchdog=2 parameter to your boot entry. This example shows the /boot/grub/menu.lst file in Red Hat Enterprise Linux:

title Red Hat Enterprise Linux Server (2.6.18-128.el5) 
        root (hd0,0) 
        kernel /vmlinuz-2.6.18-128.el5 ro root=/dev/sda nmi_watchdog=1 
        initrd /initrd-2.6.18-128.el5.img

Reboot the machine. Enter the grep command repeatedly to view the NMI count. The following output shows that the NMI count on each processor increases rapidly. Thus the NMI watchdog is enabled by the nmi_watchdog=1 boot parameter.

# grep NMI /proc/interrupts 
NMI:    2123797    2123681    2123608    2123535 
# grep NMI /proc/interrupts 
NMI:    2124855    2124739    2124666    2124593 
# grep NMI /proc/interrupts 
NMI:    2125981    2125865    2125792    2125719 
# grep NMI /proc/interrupts 
NMI:    2126692    2126576    2126503    2126430 
# grep NMI /proc/interrupts 
NMI:    2127406    2127290    2127217    2127144

The following output shows the NMI count on each processor when the NMI watchdog is enabled by the nmi_watchdog=2 kernel boot option. The NMI counts increase slowly because the count depends on processor utilization, and the system in this example is idle.

Note: There are exceptional cases where a small number of NMI appears in the /proc/interrupts file when NMI watchdog is disabled.

# grep NMI /proc/interrupts
NMI:       187       107        293       199
# grep NMI /proc/interrupts
NMI:       187       107        293       199
# grep NMI /proc/interrupts
NMI:       187       107        293       200

Now your system is ready to generate a crash dump in case it becomes unresponsive, but does not go into a panic state.
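
Instead of comparing the per-CPU columns by eye, the counters can be summed and two samples compared. This is only a sketch and not part of the quoted guide; nmi_total is a name made up for it.

```shell
# Sum the per-CPU NMI counters from /proc/interrupts-style text on stdin.
# Non-numeric trailing fields (the description text) are ignored.
nmi_total() {
    awk '$1 == "NMI:" {
        t = 0
        for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) t += $i
        print t
    }'
}

# Usage: compare two samples taken a few seconds apart.
#   a=$(nmi_total < /proc/interrupts); sleep 5; b=$(nmi_total < /proc/interrupts)
#   echo "NMIs in 5s: $((b - a))"
```

A difference that stays at or near zero suggests the watchdog is not running; a steadily growing one suggests it is.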

Frame Loss Caused by Packet Overrun

2012. 11. 23. 14:12

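When an x86 system comes under excessive network load, the Ethernet interface carrying that service can experience packet overrun. When packet overrun occurs, incoming and outgoing frames are lost, so it is an item that needs monitoring from several angles.

1. Cause of packet overrun

By far the most common reason for frame loss is a queue overrun. The kernel sets a limit on the length of a queue, and in some cases the queue fills faster than it drains. When this goes on for too long, frames start to be dropped.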

[root@/ ~]# ip -s -s link ls p4p2
11: p4p2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP qlen 1000
    link/ether 00:10:18:ce:0c:0e brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets    errors  dropped overrun mcast
    3405890854 1534610082 146442  0       146442  13189
    RX errors: length  crc     frame   fifo    missed
               0       0       0       0       0
    TX: bytes  packets    errors  dropped carrier collsns
    3957714091 1198049468 0       0       0       0
    TX errors: aborted fifo    window  heartbeat
               0       0       0       0
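
For monitoring, the overrun counter can be pulled out of this output with a small filter. The sketch below assumes the column layout shown in the listing above, where the RX header line contains an "overrun" column; other iproute2 versions arrange the columns differently, and in that case the function prints nothing. rx_overruns is a made-up name.

```shell
# Print the RX "overrun" counter from `ip -s -s link show DEV` output on stdin.
# The column position is looked up from the RX header line rather than hard-coded.
rx_overruns() {
    awk '
        $1 == "RX:" && !col {
            for (i = 2; i <= NF; i++) if ($i == "overrun") col = i - 1
            want = 1
            next
        }
        want && NF && col { print $col; exit }
    '
}

# Usage:
#   ip -s -s link show p4p2 | rx_overruns
```

Sampling this value periodically and alerting when it grows gives an early warning before frame loss becomes visible to the service.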

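2. Network Receive Path Diagram

Understanding the path a packet takes as it is received makes it easier, when handling a fault, to judge whether to troubleshoot the physical device or the operating system.

As the path diagram in the guide cited below shows, the Linux kernel receives a packet in four basic stages.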

Hardware Reception: the network interface card (NIC) receives the frame on the wire. Depending on its driver configuration, the NIC transfers the frame either to an internal hardware buffer memory or to a specified ring buffer.
Hard IRQ: the NIC asserts the presence of a net frame by interrupting the CPU. This causes the NIC driver to acknowledge the interrupt and schedule the soft IRQ operation.
Soft IRQ: this stage implements the actual frame-receiving process, and is run in softirq context. This means that the stage pre-empts all applications running on the specified CPU, but still allows hard IRQs to be asserted.

In this context (running on the same CPU as hard IRQ, thereby minimizing locking overhead), the kernel actually removes the frame from the NIC hardware buffers and processes it through the network stack. From there, the frame is either forwarded, discarded, or passed to a target listening socket.

When passed to a socket, the frame is appended to the application that owns the socket. This process is repeated until the NIC hardware buffer runs out of frames, or until the device weight (dev_weight) is reached. For more information about device weight, refer to Section 8.4.1, “NIC Hardware Buffer”, of the guide cited below.
Application receive: the application receives the frame and dequeues it from any owned sockets via the standard POSIX calls (read, recv, recvfrom). At this point, data received over the network no longer exists on the network stack.

CPU Affinity

To maintain high throughput on the receive path, it is recommended that you keep the L2 cache hot. As described earlier, network buffers are received on the same CPU as the IRQ that signaled their presence. This means that buffer data will be on the L2 cache of that receiving CPU.

To take advantage of this, place process affinity on applications expected to receive the most data on the NIC that shares the same core as the L2 cache. This will maximize the chances of a cache hit, and thereby improve performance.
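As a rough illustration of the affinity advice above: both /proc/irq/<N>/smp_affinity and taskset accept a hexadecimal CPU bitmask, so the first step is turning a CPU number into that mask. The helper name cpu_mask is hypothetical, and the IRQ number, CPU number and PID in the commented commands are placeholders to be replaced with values from your own system.

```shell
#!/bin/sh
# Sketch only: cpu_mask is a hypothetical helper. It converts a CPU number
# into the hexadecimal bitmask format used by /proc/irq/<N>/smp_affinity
# and by taskset (CPU 0 -> 1, CPU 2 -> 4, CPU 5 -> 20).
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Usage (commented out; IRQ 234, CPU 2 and PID 1234 are placeholders):
#   cat /proc/irq/234/smp_affinity                     # CPUs allowed to service the NIC IRQ
#   echo "$(cpu_mask 2)" > /proc/irq/234/smp_affinity  # steer the IRQ to CPU 2
#   taskset -p "$(cpu_mask 2)" 1234                    # pin the receiving process to CPU 2
```

If a daemon such as irqbalance is running, it may rewrite the IRQ affinity afterwards, so check the value again after a while.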

Source: https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-network-packet-reception.html#packet-reception-png

3. Workarounds at the Operating System Level

When a queue overrun occurs, check the hardware buffer counters to see whether frame loss is increasing:

[root@linux]# ethtool -S ethX | grep frame

rx_frame_error = 0

Replace ethX with the NIC's corresponding device name. This will display how many frames have been dropped within ethX. Often, a drop occurs because the queue runs out of buffer space in which to store frames.

If the rx_frame_error counter above keeps increasing, proceed as follows.

There are different ways to address this problem, namely:

Input traffic

You can help prevent queue overruns by slowing down input traffic. This can be achieved by filtering, reducing the number of joined multicast groups, lowering broadcast traffic, and the like.

Queue length

Alternatively, you can also increase the queue length. This involves increasing the number of buffers in a specified queue to whatever maximum the driver will allow. To do so, edit the rx/tx ring parameters of ethX using:

ethtool --set-ring ethX

Append the appropriate rx or tx values to the aforementioned command. For more information, refer to man ethtool.

Device weight

You can also increase the rate at which a queue is drained. To do this, adjust the NIC's device weight accordingly. This attribute refers to the maximum number of frames that the NIC can receive before the softirq context has to yield the CPU and reschedule itself. It is controlled by the /proc/sys/net/core/dev_weight variable.

Most administrators have a tendency to choose the third option. However, keep in mind that there are consequences for doing so. Increasing the number of frames that can be received from a NIC in one iteration implies extra CPU cycles, during which no applications can be scheduled on that CPU.
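Before changing the device weight, it can help to check whether the receive softirq is actually falling behind. One place to look, offered here as a suggestion rather than something the guide prescribes, is /proc/net/softnet_stat, which has one line per CPU with hexadecimal columns: the first is frames processed, the second is frames dropped, and the third (time_squeeze) counts how often the softirq had to stop with work still pending. The helper name softnet_summary is hypothetical, and the value 128 in the commented command is only a placeholder.

```shell
#!/bin/sh
# Sketch only: softnet_summary is a hypothetical helper. It reads lines in
# /proc/net/softnet_stat format on stdin (one line per CPU, hex columns) and
# prints columns 1-3 in decimal: processed, dropped, time_squeeze.
softnet_summary() {
    cpu=0
    while read -r processed dropped squeezed _rest; do
        printf 'cpu%d processed=%d dropped=%d time_squeeze=%d\n' \
            "$cpu" "$((0x$processed))" "$((0x$dropped))" "$((0x$squeezed))"
        cpu=$((cpu + 1))
    done
}

# Usage (commented out; 128 is a placeholder value, not a recommendation):
#   softnet_summary < /proc/net/softnet_stat
#   sysctl net.core.dev_weight            # show the current device weight
#   sysctl -w net.core.dev_weight=128     # change it until the next reboot
```

Comparing the counters before and after a change gives some evidence of whether the extra CPU time spent in softirq context is buying fewer drops.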

Reference

https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-network-common-queue-issues.html

Issue

A NIC shows a number of overruns in its ifconfig output, as in the example below:

    eth0     Link encap:Ethernet  HWaddr D4:AE:52:34:E6:2E  
             UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1

         >>  RX packets:1419121922 [errors:71111] dropped:0 [overruns:71111] frame:0  <<

             TX packets:1515463943 errors:0 dropped:0 overruns:0 carrier:0
             collisions:0 txqueuelen:1000 
             RX bytes:864269160026 (804.9 GiB)  TX bytes:1319266440662 (1.1 TiB)
             Interrupt:234 Memory:f4800000-f4ffffff

Environment

Red Hat Enterprise Linux

Resolution

This is not an operating system problem. It stems from the infrastructure of the environment: the network in use does not appear to support the required traffic demand.

Workaround

The following steps can be used to try to fix this kind of issue.

1 - First, configure the network device to use jumbo frames, that is, increase the MTU (the largest packet size the device sends without fragmentation).

1.1 - Edit the /etc/sysconfig/network-scripts/ifcfg-ethX file and insert the following parameter: MTU=9000

With this parameter, the interface and its aliases will use the given MTU value.

The new setting takes effect the next time the interface is brought up.
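One way to confirm that a 9000-byte MTU really works end to end is to send a ping that is not allowed to fragment. This is a sketch under assumptions, not part of the original article: the helper name icmp_payload_for_mtu is hypothetical, the interface name and target address in the commented commands are placeholders, and the test only passes if every switch and host on the path also accepts jumbo frames.

```shell
#!/bin/sh
# Sketch only: icmp_payload_for_mtu is a hypothetical helper. It prints the
# largest ICMP echo payload that fits in one unfragmented IPv4 packet of the
# given MTU: MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(($1 - 20 - 8))
}

# Usage (commented out; eth0 and 192.0.2.10 are placeholders):
#   ip link show dev eth0 | grep -o 'mtu [0-9]*'
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" 192.0.2.10
```

If the ping fails with a "message too long" style error while a smaller payload succeeds, some device on the path is still using a smaller MTU.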

2 - Set the ring buffer to the highest supported value:

2.1 - Check the values currently set (Current hardware settings:) and the maximum acceptable values (Pre-set maximums:), as in the example below:

# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:     4096
RX Mini:    0
RX Jumbo:   0
TX:     4096
Current hardware settings:
RX:     256
RX Mini:    0
RX Jumbo:   0
TX:     256

# ethtool -G eth0 rx 4096 tx 4096

# ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:     4096
RX Mini:    0
RX Jumbo:   0
TX:     4096
Current hardware settings:
RX:     4096
RX Mini:    0
RX Jumbo:   0
TX:     4096

Above, the command ethtool -G eth0 rx 4096 tx 4096 sets the ring buffer to the maximum size supported on this device.

This command must be added to /etc/rc.local so that it persists across reboots, because the setting cannot be stored on the device itself.
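Because the supported maximum differs from NIC to NIC, the command to place in /etc/rc.local can be derived from the ethtool -g output instead of being typed by hand. The sketch below only prints the command and does not run it. The helper name ring_max_command is hypothetical, and it assumes the output layout shown in step 2.1.

```shell
#!/bin/sh
# Sketch only: ring_max_command is a hypothetical helper. It reads
# `ethtool -g DEV` output on stdin and prints (does not execute) the
# ethtool -G command that raises RX and TX to the pre-set maximums.
ring_max_command() {
    dev=$1
    awk -v dev="$dev" '
        /^Pre-set maximums:/            { section = "max"; next }
        /^Current hardware settings:/   { section = "cur"; next }
        section == "max" && $1 == "RX:" { rx = $2 }   # skips "RX Mini:" and "RX Jumbo:"
        section == "max" && $1 == "TX:" { tx = $2 }
        END {
            if (rx != "" && tx != "")
                printf "ethtool -G %s rx %s tx %s\n", dev, rx, tx
        }
    '
}

# Usage (commented out; eth0 is the device from the example above):
#   ethtool -g eth0 | ring_max_command eth0
```

Printing the command first leaves room to review it before appending it to /etc/rc.local.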

3 - By default, some "auto-tuning" parameters are not set in Linux, and the default TCP buffer size is initially very small. For 10 Gb/s devices, setting the buffer to 16 MB is generally recommended. Larger values are only for devices faster than 10 Gb/s.

3.1 - Edit the /etc/sysctl.conf file and add the following lines at the end of file:

# Raise the maximum socket buffer size to 16 MB
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Raise the TCP autotuning buffer limit to 16 MB
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Maximum number of packets queued when the interface
# receives packets faster than the kernel can
# process them
net.core.netdev_max_backlog = 250000

Save file and exit.
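The three numbers on the tcp_rmem and tcp_wmem lines are the minimum, default and maximum buffer size in bytes per socket, and 16777216 bytes is exactly 16 MiB. The sketch below labels those fields so the values can be checked at a glance. The helper name tcp_mem_fields is hypothetical.

```shell
#!/bin/sh
# Sketch only: tcp_mem_fields is a hypothetical helper. It reads sysctl.conf
# style lines (or `sysctl` output) on stdin and, for net.ipv4.tcp_rmem and
# net.ipv4.tcp_wmem, labels the three values and shows the maximum in MiB.
tcp_mem_fields() {
    awk -F'[ \t=]+' '
        $1 == "net.ipv4.tcp_rmem" || $1 == "net.ipv4.tcp_wmem" {
            printf "%s min=%d default=%d max=%d (%d MiB)\n", $1, $2, $3, $4, $4 / 1048576
        }
    '
}

# Usage (commented out):
#   tcp_mem_fields < /etc/sysctl.conf
#   sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem | tcp_mem_fields
```

Comparing the file against the live sysctl output is a quick check that the values were actually loaded.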

3.2 - Check which congestion control algorithm is used in your environment: # sysctl -a | grep -i congestion_control

By default, older RHEL kernels such as the one shown here use the 'bic' algorithm:

# sysctl -a | grep -i congestion_control
net.ipv4.tcp_congestion_control = bic

For high-speed NICs like this one, an algorithm such as 'cubic' or 'htcp' is recommended. Some versions of RHEL 5.3 - 5.5 (kernel 2.6.18) have a bug in 'cubic' that is being fixed, so it is recommended to start with the 'htcp' algorithm and check the performance.

In the /etc/sysctl.conf file, add the following parameter:

net.ipv4.tcp_congestion_control=htcp

Save the file and exit.

Then apply the changes made in the sysctl.conf file with the following command: # sysctl -p
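Setting tcp_congestion_control only succeeds if the chosen algorithm is available to the kernel, so it may be worth checking that first. The sketch below is an illustration rather than part of the original article. The helper name cc_is_available is hypothetical, and it assumes that 'htcp' is provided by the tcp_htcp kernel module on your distribution.

```shell
#!/bin/sh
# Sketch only: cc_is_available is a hypothetical helper. It succeeds if the
# algorithm named in $1 appears as a whole word in the space-separated list
# $2, which is the format of net.ipv4.tcp_available_congestion_control.
cc_is_available() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage (commented out; assumes the tcp_htcp module exists on this system):
#   avail=$(sysctl -n net.ipv4.tcp_available_congestion_control)
#   cc_is_available htcp "$avail" || modprobe tcp_htcp
#   sysctl -w net.ipv4.tcp_congestion_control=htcp
```

Matching on whole words avoids false positives, for example treating 'bic' as present when only 'cubic' is listed.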

Root Cause

The overrun counter is the number of times the NIC was unable to hand received data to a buffer because the input rate exceeded the environment's capacity to receive the data. It is usually a sign of excessive traffic.

Each interface has two buffers (queues) of a fixed size, one for transmitting data and one for receiving data (packets). When one of these queues fills up, the surplus packets are discarded and counted as 'overruns'. In this case, the NIC is trying to receive or transmit more packets than the environment can support.

Diagnostic Steps

The output of ifconfig shows non-zero 'errors' and 'overruns' counters.
An article on cisco.com explains these network issues in more detail:

Troubleshooting Ethernet - Table 4-6 show interfaces ethernet Field Descriptions

    overrun:  Shows the number of times that the receiver hardware was incapable of
              handing received data to a hardware buffer because the input rate exceeded
              the receiver's capability to handle the data.
