All notable changes to this project will be documented in this file.
- Added a deb target to the Makefile using gox
- Doc update
- `SHH_META`: Off by default. This controls meta stats collection and reporting. Turn this on to get:
  - Librato stats (number of gauges, number of counters)
  - multi poller's stats
  - listen poller's meta stats
- Removed `lo` from the default `NIF_DEVICES`.
- Removed `pid.last` from the load poller. This is useless on a time series plot.
- Bugfix: ETS table memory to report in words not bytes.
- Bugfix: Used memory is a gauge, report it as such.
- Bugfix: Renamed LinearSliceContainsString to SliceContainsString, removing the old implementation, so that we only match on exact matches
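
As a rough illustration of the exact-match behaviour described in the bullet above, here is a minimal sketch. The function name comes from the changelog entry, but the signature and body are assumptions, not shh's actual code.

```go
package main

import "fmt"

// SliceContainsString reports whether want is an element of ss.
// Assumed signature. It compares whole strings, so "lo" does not match "lo0".
func SliceContainsString(ss []string, want string) bool {
	for _, s := range ss {
		if s == want {
			return true
		}
	}
	return false
}

func main() {
	devices := []string{"eth0", "lo0"}
	fmt.Println(SliceContainsString(devices, "eth0")) // true
	fmt.Println(SliceContainsString(devices, "lo"))   // false
}
```
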
- Redis poller.
- Update Go Version to 1.4.
- Travis CI Updates
- Folsom Poller (folsom) now polls for ETS table size and memory.
- Folsom Poller (folsom) now uses native `folsom_cowboy` API calls to determine metric types.
- Folsom Poller (folsom), which adds the ability to fetch metrics from a running Erlang system. See the docs for more info.
- Retry all low-level delivery errors
- Fixed a typo in the per-processes rss metric. byts -> bytes
- MetricsNameNormalizer was changed, so some metric names may change.
  - Old: `#` -> `_`, `-` -> `_`
  - New: `#` -> `.`, `_` -> `-`
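
To see how the mappings listed above affect a metric name, the following sketch applies both rule sets to the same input. The replacers and the sample name are hypothetical and cover only the two substitutions given here; the real MetricsNameNormalizer may handle other characters as well.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical replacers illustrating only the mappings from the changelog.
var (
	oldNormalizer = strings.NewReplacer("#", "_", "-", "_")
	newNormalizer = strings.NewReplacer("#", ".", "_", "-")
)

func main() {
	name := "disk#sda_1-reads" // made-up metric name
	fmt.Println(oldNormalizer.Replace(name)) // disk_sda_1_reads
	fmt.Println(newNormalizer.Replace(name)) // disk.sda-1-reads
}
```
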
- Actually catch io.EOFs so we can re-try them. Experience has shown they are most likely dropped connections.
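
The retry-on-`io.EOF` idea from the entry above could look roughly like this. `retryOnEOF` and `flakySender` are hypothetical helpers written for illustration; they are not taken from shh, and the attempt count is an arbitrary assumption.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// retryOnEOF calls send up to attempts times, retrying only when the
// error is io.EOF. Hypothetical helper, not shh's actual code.
func retryOnEOF(attempts int, send func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = send()
		if !errors.Is(err, io.EOF) {
			return err // success (nil) or a non-retryable error
		}
	}
	return err
}

// flakySender returns a send function that fails with io.EOF for the
// first failures calls and then succeeds, simulating dropped connections.
func flakySender(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return io.EOF
		}
		return nil
	}
}

func main() {
	fmt.Println(retryOnEOF(3, flakySender(2))) // <nil>
	fmt.Println(retryOnEOF(3, flakySender(5))) // EOF
}
```
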
- Splunk Search Peers poller (splunksearchpeers), which adds the ability to monitor Splunk search clusters. See the docs for more info.
- Removed LibratoNetworkTimeout, use NetworkTimeout (`SHH_NETWORK_TIMEOUT`) instead.
- Merged listen poller metrics into 3 (all parse errors are treated the same) and added prefix `meta` to them all.
- Refactored listen poller stats counters to avoid mutex
- Removed a bunch of useless logging
- handleListenConnection now part of Listen poller instead of bare.
- nagios2stats bugfix
- `nagios3stats` poller
- Fix reporting of per processes sys / user cpu
- Ignore processes w/o names, likely due to a process exiting between enumerating the directory entries and reading /proc//stat
- Generate additional process stats for processes that match `SHH_PROCESSES_REGEX`.
- `SHH_PROCESSES_REGEX`: `\A\z` - Regex of process names to poll for additional measurements.
- `SHH_TICKS`: 100 - CPU ticks per second. Default should be correct for most systems. See `getconf CLK_TCK`. Temporary until we use cgo to get it.
- `SHH_PAGE_SIZE`: 4096 - kernel page size. Default should be correct for most systems. See `getconf PAGESIZE`. Temporary until we use cgo to get it.
- `SHH_LIBRATO_ROUND`: true - round measurement times to nearest interval during submission
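
For context on why `SHH_TICKS` and `SHH_PAGE_SIZE` matter: per-process CPU times in `/proc` are counted in clock ticks and resident memory in pages, so both need a conversion factor. The sketch below shows one way such settings might be read and applied. The helper names are hypothetical and this is not how shh itself is implemented; only the variable names and defaults come from the entries above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envInt reads an integer environment variable, falling back to def
// when it is unset or not a valid integer. Hypothetical helper.
func envInt(name string, def int64) int64 {
	if v, err := strconv.ParseInt(os.Getenv(name), 10, 64); err == nil {
		return v
	}
	return def
}

// ticksToSeconds converts CPU clock ticks to seconds.
func ticksToSeconds(ticks, ticksPerSecond int64) float64 {
	return float64(ticks) / float64(ticksPerSecond)
}

// pagesToBytes converts a resident page count to bytes.
func pagesToBytes(pages, pageSize int64) int64 {
	return pages * pageSize
}

func main() {
	ticks := envInt("SHH_TICKS", 100)         // default from the changelog
	pageSize := envInt("SHH_PAGE_SIZE", 4096) // default from the changelog
	fmt.Println(ticksToSeconds(250, ticks))   // CPU seconds for 250 ticks
	fmt.Println(pagesToBytes(10, pageSize))   // bytes for 10 resident pages
}
```
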
- Some disk poller stats were incorrectly being reported as Gauges.
- `SHH_DF_LOOP` introduced to avoid loopback mounts from showing up in df poller (default false)
- `SHH_FULL` introduced to add back full set of metrics from some pollers
- Remove `DEFAULT_SELF_POLLER_MODE` in favor of `SHH_FULL="self"`
- `SHH_CPU_AGGR` defaults to true, eliminating cpu metrics for all cores by default
- `SHH_DF_TYPES` removes tmpfs by default
- Default pollers to minimal set. Utilize `SHH_FULL` to get full set of metrics
- Update some defaults: `SHH_INTERVAL=60s` & `SHH_LIBRATO_BATCH_TIMEOUT=10s`
- Bugfix
- SHH_LIBRATO_BATCH_SIZE defaults to 500
- SHH_LIBRATO_NETWORK_TIMEOUT defaults to 5s
- SHH_LIBRATO_BATCH_TIMEOUT defaults to SHH_INTERVAL
- Librato Outputter timeout doesn't start until there is a measurement
- Report to librato how many gauges / counters are being reported in a batch
- `$SHH_LISTEN_TIMEOUT` now controls the timeout on the socket.
- Librato outlet now reports a `User-Agent` header at the request of Librato.
- Timeout errors to the librato api are now reported.
- Better handling of ntpdate sub process error messages.
- shh-value cli tool for interacting with the unix socket.
- Latest version of Go (1.3.3) used.
- use github.com/heroku/slog for structured logging (extracted from shh originally).
- LISTEN Poller documentation.
- Improved Listen Poller with support for types and units
- Use of Go's logger (over fmt.Println)
- Units. Measurements now have units which the Librato outputter takes advantage of. No other outputters currently take advantage of this.
- sockstat poller now uses lowercase protocol names in emitted metrics. Previously, it broke convention and used uppercase. (e.g. 'UDP' is now 'udp')
- The percentage calculations for the "mem" and "df" pollers resulted in values between 0 and 1. CPU percentages are between 0-100. They now all use 0-100 instead of 0-1.
Sadly, we didn't keep a proper changelog for previous versions. :(