IRC Archive / Freenode / #nagios / 2010 / March / 05 / 1
[1]Artha
hey all
I am doing check_by_ssh and on the UI I am getting "Usage: check_by_ssh -H <host> -C <command> [-fqv] [-1 "
RomainK
That means you're not passing check_by_ssh the right arguments.
Double check your check_command and make sure it works from the command line as the nagios user.
[1]Artha
I did a test on the command line and found that nagios requires GLIBC 2.8, and I am trying to monitor a redhat system that has GLIBC 2.5
yum reports that is the latest version
is there a way around?
RomainK
[1]Artha: the server you're trying to monitor doesn't need anything.
Why would you install nagios on the server you're monitoring?
[1]Artha
I did not, all I did is copy check_load to that box, and from the Nagios server I am doing check_by_ssh and executing that command
RomainK
Ah.
They're not the same versions of the OS and you copied a binary over..?
[1]Artha
for the command I am running: $USER1$/check_by_ssh -H $HOSTADDRESS$ -c "/home/nagios/bin/check_load -w $ARG1$ -c $ARG2$"
yea
RomainK
you expected that to work?
Anyway, you need to find a version of check_load that's compiled for that RH version, with the right deps.
[1]Artha
ah
thank you, I have to track that down and see how I can get a version for RH
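A side note on the command line quoted above: the usage message lists `-C <command>` (uppercase) for the remote command, while the quoted definition passes it with lowercase `-c`, which on its own could be enough to trigger the usage error. A minimal sketch of the definition with the uppercase flag, assuming the remote path stays as quoted; the command name `check_ssh_load` is hypothetical, and the remote check_load still has to be a build that matches the monitored host's glibc.

```cfg
# Hypothetical command name; paths and thresholds as quoted in the log
define command {
    command_name    check_ssh_load
    # -C (uppercase) carries the command to run on the remote host;
    # the inner -w/-c are check_load's own warning/critical thresholds
    command_line    $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/home/nagios/bin/check_load -w $ARG1$ -c $ARG2$"
}
```

Running the same line by hand as the nagios user, with the macros replaced by real values, is the quickest way to confirm it before reloading Nagios.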
d_low
hey gang, can anyone help me with pnp4nagios
for some reason im getting squares for my graph labels... I think it might be a localization issue, but im not sure.
found it. missing deja-vu fonts
jmm
hi everyone.
sni
hi
jmm
I'd like to improve my nagios setup. The problem is that some of the monitored hosts seem to have connection troubles; every day it spams me with host down/host up alerts and notifications. Sometimes it's just for 1 or 2 min, but it can go up to about 5. Does anyone have ideas on how I should change my setup so it works better with such hosts?
rubeus80
Hi, I need to take all data from nagios events, including perfdata. nagios.log doesn't fit our needs... it just shows changes. I tried status.dat, but it is not a log: you need to check timestamps for hosts and services. With 500 machines and 10 services per machine this is quite complicated and inefficient... This data must be integrated into a business intelligence app
I had a technical problem and quit... did someone answer my previous question? if not, sorry for this message
sni
no
but it seems like you are looking for NDO or something similar
rubeus80
sni maybe.. just trying to figure out if it was possible with logs
sni
like you said, the logs contain only the changes
rubeus80
no way to configure nagios to create a log with everything?
sni
i dont know your goal, but maybe the performance data written to a file would be enough too
you may enable the debug log :)
rubeus80
lol
suicidal but who knows
sni
what are you trying to achieve?
rubeus80
sni we need to know if there is a change in a service or host state.. nagios.log is ok for that... but we also need to get all the performance data in real time (i mean when it's produced)
sni
nagios is capable of writing performance data to an extra logfile
rubeus80
ok
that would do the work
a log for every service or host with perfdata, or a single log?
sni
if thats enough for you, have a look at http://nagios.sourceforge.net/docs/3_0/perfdata.html
rubeus80
sni thank you very much
sni
depends on your configuration
rubeus80
you people are always helpful
:)
sni
especially the last chapter on that page
rubeus80
ok
I'll take a look
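For reference, the file-based approach sni points to can be sketched with the nagios.cfg directives described on that perfdata page. This is only a sketch for Nagios 3.x: the file paths and the choice of macros in the templates are assumptions. With this layout there is one file for all services and one for all hosts, each line tagged with the host and service it came from.

```cfg
# nagios.cfg (Nagios 3.x); file paths are assumptions
process_performance_data=1

# One line per service check result, appended to a single file
service_perfdata_file=/var/log/nagios/service-perfdata.log
service_perfdata_file_template=$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$
service_perfdata_file_mode=a

# Same idea for host check results
host_perfdata_file=/var/log/nagios/host-perfdata.log
host_perfdata_file_template=$TIMET$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTOUTPUT$\t$HOSTPERFDATA$
host_perfdata_file_mode=a
```

Individual hosts and services can opt out with `process_perf_data 0` in their object definitions.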
partner
seems to be quite quiet shift today :)
Zordrak
i am going crazy trying to work out why sudo nrpe calls are failing on a centos box. looking for help. everything done under an account called nrpe. su - to nrpe and this works fine:
/usr/bin/sudo /usr/lib64/nagios/plugins/check_hddtemp /dev/sda 40 50
nrpe cant do it on its own
running: strace -f /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
the only thing i can find is that just before the process exits it writes to fd2 (apparently to nowhere since debug logging doesn't seem to log :/) "sudo: needs to be suid root"
/usr/bin/sudo is --s--x--x, check_hddtemp is root.nrpe rwsr-xr-x
jmm
what does ls -l /usr/bin/sudo tell ?
oh :)
Zordrak
/etc/sudoers is 0440
everything seems fine.. not to mention, as i said, logging in AS nrpe and running it works fine
but the daemon just keeps failing
the sudoers line: nrpe ALL=(root) NOPASSWD:/usr/lib64/nagios/plugins/check_hddtemp,{others}
selinux is disabled completely
hence.. im going crazy
jmm
what does your nagios config for that check look like?
Zordrak
running manually from all over the place.. /location/of/check_nrpe -H host -c check_temp_sda
nrpe.cfg: command[check_temp_sda]=/usr/bin/sudo /usr/lib64/nagios/plugins/check_hddtemp /dev/sda 40 50
jmm
you could try to chmod +s check_hddtemp, but just to test.
so you can remove the call to sudo, and check if it's ok.
Zordrak
no joy
trying without sudo call
"UNKNOWN: please make sure script is running as root"
jmm
ENOLUCK.
Zordrak
(!)
running strace on the check_nrpe doesnt "seem" to get fed the sudo error.. looks like the connection is just killed which results in the normal "unable to read output". i assume this is just default behaviour when the daemon hits an abnormal exit
whats really baffling me is why it works manually under the nrpe account, but just not from the daemon itself
CapnDan
morning all!
what's the configuration for adding topology to Nagios called?
where you say, this set of hosts is "behind" this host?
(so if this host is down, don't bother alarming on all the stuff behind it)
jmm
CapnDan: look at the parent / child relation.
CapnDan
parents!! perfect, thanks!
jmm
yw.
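A minimal sketch of the parent/child relation jmm mentions, using the `parents` directive in a host definition. The host names, addresses, and the `generic-host` template are assumptions for illustration. When the parent is down, Nagios treats the child as UNREACHABLE rather than DOWN, so notifications for it can be suppressed through the host's notification options.

```cfg
# Hypothetical hosts; "generic-host" is assumed to be an existing template
define host {
    use         generic-host
    host_name   router01
    alias       Edge router
    address     192.0.2.1
}

define host {
    use         generic-host
    host_name   web01
    alias       Web server behind router01
    address     192.0.2.10
    # web01 is only reachable through router01
    parents     router01
}
```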
Zordrak
maybe i might get *somewhere if i can get some log output
its set to log to syslog and debug.. but after the initial load, it doesnt spit a damn thing out to the log
jmm: fixed it
jmm: bloody stupid redhat defaults (hence why nearly everything runs slackware)
jmm: centos' sudoers file has "Defaults requiretty"
so, no tty login, no sudoers
and a stupid unrelated error msc
*msg
seems like i could solve a lot of my issues if i could somehow get nrpe to output STDERR from commands it runs
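The fix Zordrak describes can be sketched as a sudoers override that lifts the tty requirement for the nrpe account only, leaving the distribution default in place for everyone else. This is a sketch, assuming the account is named nrpe as in the log; the existing NOPASSWD line stays as it is.

```cfg
# /etc/sudoers (edit with visudo)
# Keep the global default, but exempt the nrpe user from it
Defaults    requiretty
Defaults:nrpe !requiretty
```

For the STDERR wish, a common workaround is to append `2>&1` to the command line in nrpe.cfg, which should work since the daemon runs commands through a shell; whether the extra text fits the plugin output limit is a separate question.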
CapnDan
Whoaaa Nagios hosts map looks so much better with that "parents" topology added!
Valcor
CapnDan: shocking huh?
CapnDan
:-)
It's an iterative process, eh, I got all the important services in, enough to make my cowworkers go "oohhh" and "aaahhh" , now I can start adding the niceties.
The status map doesn't auto refresh or change colours when there are problems, does it?
LzrdKing
it shows when hosts are down
CapnDan
'k thanks.
LzrdKing
and yes, even throwing in dummy parents really makes a difference
(Action) has dummy parents
CapnDan
lol
in this case I have an esx box with windows, linux, and solaris containers on it, and the Solaris container has zones in it, which themselves of course have services, so it works out beautifully.
LzrdKing
the vrml map is cool, but i can't get to the nagios server from my windows vm to view it and all the vrmol viewers for my mac suck
vrml*
so i just do without it
but seriously, my parents are cool
sni
i would like to add parents, i just don't know them :)
CapnDan
What is that? "3-D Status Map" ?? Mine doesn't work either.
Valcor
something not worth using
requires a plugin for browsers to support VRML... and doesn't really give you anything worth it
LzrdKing
well it looks cool
sni
i would like to have a status map, based on host addresses
LzrdKing
vrml is so 1995
Valcor
LzrdKing: it does?
LzrdKing
Valcor: in my opinion
CapnDan
screenshot of vrml display plz
LzrdKing
can't get one now :)
sni
i build my nagios without
Valcor
I don't think 3.2 includes it any more
sni
its really useless
it's in 3.2.0
but there is no link, you have to go to statuswrl.cgi manually
Kalavera
hello
is there a way to split my configuration in files like a kind of include?
Valcor
Kalavera: yes
http://nagios.sourceforge.net/docs/3_0/configmain.html
cfg_file to load an individual file, or cfg_dir to load the entire directory
but you should see samples of that already in your configuration
Kalavera
so in this way could I have a customer directory with their own host and services files?
CapnDan
Kalavera: sure, you can just include them in your main configuration.
LzrdKing
ohh, using a directory may be helpful to me...
Valcor
cfg_dir=/etc/nagios3/objects
one entry, includes all .cfg files if they're under /etc/nagios3/objects
recursively iirc
what you do under that dir, nagios doesn't care, as long as it has permissions to the files/directories, and they are correct syntax... you're good to go
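A short sketch of the per-customer layout Kalavera asks about, built on the `cfg_dir` entry Valcor quotes. The customer directory names and the `commands.cfg` path are hypothetical; only files ending in .cfg are picked up.

```cfg
# nagios.cfg; directory layout below is hypothetical
cfg_dir=/etc/nagios3/objects
# Files that would be loaded from subdirectories, for example:
#   /etc/nagios3/objects/customers/acme/hosts.cfg
#   /etc/nagios3/objects/customers/acme/services.cfg
#   /etc/nagios3/objects/customers/globex/hosts.cfg

# A single file can still be loaded explicitly
cfg_file=/etc/nagios3/commands.cfg
```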
adurotec
Does anyone know why this is happening on a remote host: http://fpaste.org/LAt1/
Valcor
does the command exist in $HOME/libexec/check_disk?
does the user nagios have rights to execute it?
keith4
holy sh*t
Zordrak
yup
Valcor
mislabelled the libexec dir?
keith4
(Action) blinks
Valcor
adurotec: so does the user nagios, have a libexec directory in their $HOME directory?
adurotec
yes
CapnDan
Pet peeve of the day: explaining to Windows people that I don't have to tweak the memory of the Solaris server every time they start up a new application, or turn off an old one.
"How much physical memory does it have?" "Never mind." "But, i'll be adding..." "I know. Don't worry about it." "But (bla bla bla) two gigs .." "Don't worry about it."
Actually I'm taking entirely the wrong approach, I should be saying, "Thank you for telling me, I'll make the necessary adjustments to Solaris."
adurotec
is it possible to specify a different port for the plugin "check_https_url!10!20!/healthcheck" and if so where is this done?
Valcor
CapnDan: ahh you're finally catching on :)
CapnDan: I've found it pointless arguing the same stuff over and over, and just tell them what they want to hear most of the time
adurotec: you can either edit check_https_url command definition to use a different port, or you can add another argument to pass the port into the check command, or you can create another command
keith4
you should probably create a new one that takes a port parameter
adurotec
so something like -p "$ARG4$" appended to
Valcor
quotes not needed for the port number... but pretty much yes... then your call would append !portnum on the end
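A sketch of the "new command with a port parameter" option. The existing `check_https_url` definition is not shown in the log, so this assumes it wraps check_http with SSL and maps the three arguments to warning time, critical time, and URL; the command name, host name, template, and port 8443 are all hypothetical.

```cfg
# Assumes check_https_url is a check_http wrapper; its real definition is not shown
define command {
    command_name    check_https_url_port
    command_line    $USER1$/check_http -H $HOSTADDRESS$ --ssl -w $ARG1$ -c $ARG2$ -u $ARG3$ -p $ARG4$
}

define service {
    use                     generic-service
    host_name               web01
    service_description     HTTPS healthcheck
    # warning 10s, critical 20s, URL, then the port as the fourth argument
    check_command           check_https_url_port!10!20!/healthcheck!8443
}
```

Keeping the original command untouched and adding a second one avoids breaking services that already call `check_https_url` with three arguments.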
LzrdKing
does nagios log sending notifications? i'm not getting them emailed to me anymore and i don't know why
Valcor
LzrdKing: if you enable it yes
http://nagios.sourceforge.net/docs/3_0/configmain.html
keith4
if it suddenly stopped... it's most likely a sendmail problem
LzrdKing
god, what order are those options in? seems random
is it possible to force nagios to send out a notification for testing?
CapnDan
LzrdKing: what do you mean?
Valcor
LzrdKing: you mean the test option under the service?
"custom service notification"?
CapnDan
LzrdKing: yeah the different check_modules have totally different syntax. Luckily all can be unified by the command{ } magic
LzrdKing
valcor: custom service notification is what i'm looking for, thanks!
keith4
you can just find your notification command, and run it manually
LzrdKing
ugh, no route to host...
hooray! it was a nameserver issue
nagios will once again be spamming me
CapnDan
burn it out with fire!
LzrdKing
burn out nagios?
CapnDan
no, the nameserver
LzrdKing
i was using internal nameservers instead of external ones
CapnDan
BURN THEM
keith4
having part of your notification process depend on DNS is potentially dangerous
especially if you are bothering to check the DNS service
LzrdKing
of course it depends on dns to send internet email
though it should be able to work locally...
keith4: why dangerous, unless dns goes down?
RomainK
is it 5pm PST yet?
CapnDan
lol not for another 9 hours or so man
RomainK
:(
That's what I was afraid of.
LzrdKing
no, only about 8 hours
CapnDan
hence the "or so" :-)
keith4
LzrdKing: for example... if you had used IP addresses for notification, you wouldn't have had your most recent problem
RomainK
IPs for notification?
that's something I hadn't thought of.
Wouldn't an IP-based smarthost be better?
keith4
yes
RomainK
or is that what you meant?
keith4
that's what i meant
emias
(Action) would usually prefer a local DNS server on the Nagios host.
Alternatively, one could simply decide that DNS won't break, as we do.
Valcor
I'd still use an IP based smart host
RomainK
(Action) has done the DNS slave on nagios server thing before.
But for the most part, I just used IP-based everything.
LzrdKing
i'm using procmail, it looks up the mx record when it sends mail