
NAME

App::Netdisco::Manual::Troubleshooting - Tips and Tricks for Troubleshooting

Understanding Nodes and Devices

The two basic components in Netdisco's world are Nodes and Devices. Devices are your network hardware, such as routers, switches, and firewalls. Nodes are the end-stations connected to Devices, such as workstations, servers, printers, and telephones.

Devices respond to SNMP, and can therefore report useful information about themselves, such as interfaces, operating system, and IP addresses, as well as knowledge of other systems via MAC address and ARP tables. Devices are actively contacted by Netdisco during a discover (and during other polling jobs such as macsuck and arpnip).

Netdisco discovers Devices using "neighbor protocols" such as CDP and LLDP. We assume your Devices are running these protocols and learning about their connections to each other. If they aren't, you'll need to configure manual topology within the web interface (or simply have standalone Devices).

Nodes, on the other hand, are passive as far as Netdisco is concerned. The only job that contacts a Node is nbtstat, which makes NetBIOS queries. Nodes are learned about via the MAC and ARP tables on upstream Devices.

Because Netdisco only learns about Devices through a neighbor protocol, it's possible to run an SNMP agent on a Node without Netdisco treating it as a Device. Only if the Node is also advertising itself via a neighbor protocol will Netdisco treat it as a Device. This can account for undesired behaviour, such as treating a server (Node) as a Device or, conversely, recognising a switch (Device) only as a Node.

To prevent discovery of any target as a Device, use the discover_no, discover_no_type, or discover_only configuration settings. If you don't see links between Devices in Netdisco, it might be because they're not running a neighbor protocol, or for some reason are not reporting the relationships to Netdisco. Use the show command to troubleshoot this:

 ~netdisco/bin/netdisco-do show -d 192.0.2.1 -e c_id

Understanding Netdisco Jobs

Please read the section above, if you've not yet done so.

Netdisco has four principal job types:

discover

Gather information about a Device, including interfaces, vlans, PoE status, and chassis components (modules). Also learns about potential new Devices via neighbor protocols and adds jobs for their discovery to the queue.

macsuck

Gather MAC to port mappings from known Devices reporting Layer 2 capability. Wireless client information is also gathered from Devices supporting the 802.11 MIBs.

arpnip

Gather MAC to IP mappings from known Devices reporting Layer 3 capability.

nbtstat

Poll a Node to obtain its NetBIOS name.

The actions named above each operate on one device only. The complementary job types discoverall, macwalk, arpwalk, and nbtwalk enqueue one corresponding single-device job for each known device. The Netdisco backend daemon then processes the queue (in a random order).
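
The relationship between the single-device jobs and their queue-filling counterparts can be sketched with a pair of shell functions. This is only an illustration: the function names poll_device and enqueue_all are made up for this example, and the NETDISCO_DO default assumes you are logged in as the netdisco user.

```shell
# Assumption: run as the netdisco user, so the tool lives under $HOME/bin.
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# Run the three SNMP polling jobs against one Device, stopping at the
# first job that fails. (poll_device is a hypothetical helper name.)
poll_device() {
    device=$1
    for action in discover macsuck arpnip; do
        "$NETDISCO_DO" "$action" -d "$device" || return 1
    done
}

# Enqueue one single-device job per known device, for each job type.
# The backend daemon then works through the queue.
enqueue_all() {
    for action in discoverall macwalk arpwalk nbtwalk; do
        "$NETDISCO_DO" "$action" || return 1
    done
}
```

For example, "poll_device 192.0.2.1" runs discover, macsuck, and arpnip in turn against that one Device.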

My Device details look all wrong!

See the tips in the Vendors Guide, or else contact the community email list.

Devices are not being discovered

Besides reading the whole of this manual page for general tips, take a look at the "SNMP Connect Failures" report under the Admin menu. Any devices listed have had multiple SNMP connect failures, indicating a possible configuration error on the device or in Netdisco's configuration.

After OS update or upgrade, Netdisco fails

If you upgrade the operating system then your system libraries will change and Netdisco needs to be rebuilt (specifically, C library bindings).

The safest way to do this is to set up a new user and follow the same install instructions, connecting to the same database. Stop the web and backend daemons for the old user, and start them for the new user. Then delete the old user account.

Alternatively, if you do not mind the downtime: stop the web and backend daemons then delete the ~netdisco/perl5 directory and reinstall from scratch. The configuration file, database, and MIBs can all be reused in-place.
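
The in-place alternative could be prepared along the following lines. Treat this as a cautious sketch rather than a supported procedure: it assumes that the daemons are named netdisco-web and netdisco-backend and that both accept a "stop" argument, and the names run, rebuild_prepare, NETDISCO_HOME, and DRY_RUN are invented for this example. By default it only prints the commands.

```shell
# Assumptions: run as the netdisco user; netdisco-web and netdisco-backend
# both accept "stop". NETDISCO_HOME and DRY_RUN are hypothetical knobs.
NETDISCO_HOME="${NETDISCO_HOME:-$HOME}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = really run them

# Print the command when DRY_RUN is 1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop both daemons, then remove the locally built Perl libraries.
rebuild_prepare() {
    run "$NETDISCO_HOME/bin/netdisco-web" stop &&
    run "$NETDISCO_HOME/bin/netdisco-backend" stop &&
    run rm -rf "$NETDISCO_HOME/perl5"
    # Next: reinstall by following the normal installation instructions.
}
```

Review the printed commands first, and only then call "DRY_RUN=0 rebuild_prepare". The configuration file, database, and MIBs are left untouched.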

Run a netdisco-do Task with Debugging

The netdisco-do command has several debug flags which show what's going on internally. Usually you add -D for general Netdisco debugging, then -I for SNMP::Info logging and -Q for SQL tracing. For example:

 ~netdisco/bin/netdisco-do discover -d 192.0.2.1 -DIQ

You will see that SNMPv2 community strings are hidden by default, to make the output safe for sending to Netdisco developers. To show the community string, set the SHOW_COMMUNITY environment variable:

 SHOW_COMMUNITY=1 ~netdisco/bin/netdisco-do discover -d 192.0.2.1 -DIQ
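
When the output is destined for a bug report, it can help to capture it in a file. The following is a minimal sketch: the function name debug_discover and the default log file name are made up for this example, and it redirects both standard output and standard error because it makes no assumption about which stream the debug messages use.

```shell
# Assumption: run as the netdisco user, so the tool lives under $HOME/bin.
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# Run a discover with all three debug flags and save everything it
# prints (stdout and stderr) to a log file.
debug_discover() {
    device=$1
    logfile=${2:-netdisco-debug.log}
    "$NETDISCO_DO" discover -d "$device" -DIQ > "$logfile" 2>&1
}
```

For example, "debug_discover 192.0.2.1 /tmp/discover.log". Leave SHOW_COMMUNITY unset for this so that the community strings stay hidden in the saved log.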

Dump an SNMP object for a Device

This is useful when trying to work out why some information isn't displaying correctly (or at all) in Netdisco. It may be that the SNMP response isn't understood. Netdisco can dump any leaf or table, by name:

 ~netdisco/bin/netdisco-do show -d 192.0.2.1 -e interfaces
 ~netdisco/bin/netdisco-do show -d 192.0.2.1 -e Layer2::HP::interfaces

You can combine this with SNMP::Info debugging, shown above (-I).
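
If several objects need checking on the same Device, a small loop saves typing. This sketch uses a hypothetical helper named dump_objects and assumes you are logged in as the netdisco user. It adds -I to each call, as described above.

```shell
# Assumption: run as the netdisco user, so the tool lives under $HOME/bin.
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# Dump each named leaf or table for one Device, with SNMP::Info
# logging (-I) enabled. First argument is the Device, the rest are
# object names.
dump_objects() {
    device=$1
    shift
    for object in "$@"; do
        "$NETDISCO_DO" show -d "$device" -e "$object" -I
    done
}
```

For example, "dump_objects 192.0.2.1 interfaces c_id" dumps both objects in turn.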

Interactive SQL terminal on the Netdisco Database

Start an interactive terminal with the Netdisco PostgreSQL database. If you pass an SQL statement in the "-e" option then it will be executed.

 ~netdisco/bin/netdisco-do psql
 ~netdisco/bin/netdisco-do psql -e 'SELECT ip, dns FROM device'
 ~netdisco/bin/netdisco-do psql -e 'COPY (SELECT ip, dns FROM device) TO STDOUT WITH CSV HEADER'

The last example above is useful for sending data to Netdisco developers, as it's more compact and readable than the standard tabular output (second example).
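
The COPY wrapper can be generated for any SELECT statement and the result saved to a file. This is a sketch only: csv_copy_sql and export_csv are names invented for this example, and the query passed in must not end with a semicolon because it is placed inside the parentheses of the COPY statement.

```shell
# Assumption: run as the netdisco user, so the tool lives under $HOME/bin.
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# Wrap a SELECT statement in COPY ... TO STDOUT so that the result
# is emitted as CSV with a header row.
csv_copy_sql() {
    printf 'COPY (%s) TO STDOUT WITH CSV HEADER' "$1"
}

# Run the wrapped query and write the CSV to the given file.
export_csv() {
    query=$1
    outfile=$2
    "$NETDISCO_DO" psql -e "$(csv_copy_sql "$query")" > "$outfile"
}
```

For example, "export_csv 'SELECT ip, dns FROM device' devices.csv".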

Database Schema Redeployment

The database schema can be fully redeployed (even over an existing installation), in a safe way, using the following command:

 ~netdisco/bin/netdisco-db-deploy --redeploy-all

Debug HTTP Requests and Configuration

You can see HTTP Headers received by Netdisco, and other information such as how it's parsing the config file, by enabling the Dancer debug plugin. First download the plugin:

 ~netdisco/bin/localenv cpanm --notest Dancer::Debug

Then run the web daemon with the environment variable to enable the feature:

 DANCER_DEBUG=1 ~/bin/netdisco-web restart

A side panel appears in the web page with debug information. Be sure to turn this off when you're done (stop and start without the environment variable) otherwise secrets could be leaked to end users.
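
To make it harder to leave the debug panel enabled by accident, the on and off steps can be wrapped in one function. This is a sketch under a few assumptions: web_debug is a made-up helper name, you are logged in as the netdisco user, and netdisco-web accepts "stop" and "start" in the same way it accepts "restart".

```shell
# Assumption: run as the netdisco user, so the daemon lives under $HOME/bin.
NETDISCO_WEB="${NETDISCO_WEB:-$HOME/bin/netdisco-web}"

# Switch the Dancer debug panel on or off.
web_debug() {
    case $1 in
        on)
            DANCER_DEBUG=1 "$NETDISCO_WEB" restart
            ;;
        off)
            # Stop, then start in a subshell where DANCER_DEBUG is
            # unset, in case it was exported in the current shell.
            "$NETDISCO_WEB" stop &&
            ( unset DANCER_DEBUG; "$NETDISCO_WEB" start )
            ;;
        *)
            echo "usage: web_debug on|off" >&2
            return 2
            ;;
    esac
}
```

Run "web_debug on" while investigating and "web_debug off" as soon as you are done.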

Change the SNMP community string for a Device

If you change the SNMP community string in use on a Device, and update Netdisco's configuration to match, then everything will continue to work fine.

However, if the Device happens to support two community strings then Netdisco can become "stuck" on the wrong one, as it caches the last-known-good community string to improve performance. To work around this, delete the device (either in the web GUI or using netdisco-do at the command line), and then re-discover it.
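
At the command line, the workaround might look like the sketch below. It assumes that the netdisco-do delete action takes the device with -d, like the other actions shown in this document, and rediscover is a made-up helper name. Bear in mind that deleting a device removes the data Netdisco holds for it.

```shell
# Assumption: run as the netdisco user, so the tool lives under $HOME/bin.
NETDISCO_DO="${NETDISCO_DO:-$HOME/bin/netdisco-do}"

# Delete a Device so that the cached community string is forgotten,
# then discover it again. The discover only runs if the delete succeeds.
rediscover() {
    device=$1
    "$NETDISCO_DO" delete -d "$device" &&
    "$NETDISCO_DO" discover -d "$device"
}
```

For example, "rediscover 192.0.2.1".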

Installation on SLES 11 SP4

Try running the following command for installation:

 curl -L http://cpanmin.us/ | CFLAGS="-DPERL_ARGS_ASSERT_CROAK_XS_USAGE" perl - --notest --local-lib ~/perl5 App::Netdisco