OT Logging That Matters: Less Noise, More Signal
“If everything is an alert, nothing is.” In the world of Operational Technology (OT), this adage rings especially true. OT teams, managing everything from manufacturing lines to energy grids, don’t need more logs flooding their systems. They need the right logs: those that cut through the clutter to highlight real threats. The ultimate goal? Faster threat detection with minimal disruption to daily operations. In an environment where downtime can cost millions, effective logging isn’t just a nice-to-have; it’s a lifeline for resilience.
Admission - OT Logging Is Hard
OT environments are notoriously tricky for logging. First, legacy devices dominate the landscape. Many industrial control systems (ICS) and programmable logic controllers (PLCs) were built decades ago, with limited or no built-in logging capabilities. Upgrading them isn’t straightforward, and retrofitting sensors or software could void warranties or introduce instability.
Vendor constraints add another layer of complexity. OT vendors often prioritize safety and reliability over security features, fearing that extensive logging might interfere with real-time processes. Enabling logs on a PLC could inadvertently slow down critical operations, like a conveyor belt in a factory.
Then there’s the nuance of OT protocols. Unlike IT networks, OT traffic includes “normal weirdness”, like sporadic bursts of data from sensors or irregular communications that look suspicious but are benign. Misinterpreting these can lead to alert fatigue.
Finally, staffing shortages exacerbate the issue. OT teams are often lean, with limited personnel to triage endless noise. Security analysts might not understand the operational context, leading to false positives that erode trust in the system.
Better Definition of the “Signals That Matter”
So, what are we really trying to catch? Effective OT logging focuses on high-impact signals: unauthorized remote access, privilege misuse, the appearance of new devices on the network, and unauthorized configuration or logic changes in controllers.
Abnormal command patterns are key red flags. Think of a valve opening unexpectedly or a pump speed ramping up without reason. Another is unexpected communication across security zones, like traffic flowing from a Level 4 enterprise network directly into a Level 1 control system, which could indicate lateral movement by attackers. By zeroing in on these, logging becomes a targeted tool rather than a blunt instrument.
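As a minimal sketch of the cross-zone check described above, the Python below maps addresses to Purdue levels and flags conversations that skip more than one level. The subnets, the `ZONES` table, and the function names are hypothetical placeholders; a real site would draw its zone data from its own asset inventory and firewall policy.

```python
import ipaddress
from typing import Optional

# Hypothetical subnet-to-Purdue-level mapping; replace with your own zones.
ZONES = {
    ipaddress.ip_network("10.4.0.0/16"): 4,  # enterprise IT (assumed)
    ipaddress.ip_network("10.3.0.0/16"): 3,  # site operations (assumed)
    ipaddress.ip_network("10.2.0.0/16"): 2,  # supervisory control (assumed)
    ipaddress.ip_network("10.1.0.0/16"): 1,  # basic control, PLCs (assumed)
}


def purdue_level(ip: str) -> Optional[int]:
    """Return the Purdue level for an address, or None if it is unmapped."""
    addr = ipaddress.ip_address(ip)
    for network, level in ZONES.items():
        if addr in network:
            return level
    return None


def is_zone_skip(src_ip: str, dst_ip: str, max_gap: int = 1) -> bool:
    """True when traffic jumps more than max_gap levels or involves an unmapped host."""
    src, dst = purdue_level(src_ip), purdue_level(dst_ip)
    if src is None or dst is None:
        return True  # unknown address: worth a look rather than silently ignored
    return abs(src - dst) > max_gap
```

Treating unmapped addresses as suspicious is a design choice for this sketch; some teams may prefer to route those to an inventory-cleanup queue instead of raising a security alert.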
The Top OT Log Sources to Prioritize
To get high-value insights with low drama, focus on these core sources:
- Remote Access Logs: Track VPN connections, jump hosts, and remote support tools. Log authentication attempts, session durations, and actions taken; these are essential for spotting insider threats or vendor overreach.
- Boundary Controls: Firewall logs at IT/OT interfaces and between Purdue Model zones capture allows and denies. These reveal attempted breaches or policy violations without overwhelming internal telemetry.
- Identity Logs: Authentication events for OT-capable accounts, including admin logins and privilege escalations. This helps detect compromised credentials early.
- Monitoring Platforms: Passive sensors like network taps provide alerts tuned to baseline behaviors, flagging anomalies without active probing that could disrupt operations.
- Engineering Actions: Where supported, log PLC program downloads, firmware updates, and config changes. These are gold for auditing changes that could introduce vulnerabilities.
A 3-Phase OT Logging Strategy
Remember, a major point is to not overwhelm your already busy team. Consider rolling out logging in these 3 key phases.
Phase 1: Foundational Basics
Start with remote access, boundary, and identity logs. These are often easiest to enable and provide immediate visibility into external threats.
Phase 2: Network Visibility
Add OT network monitoring with protocol-aware detections. Tools that understand Modbus, DNP3, or OPC-UA can parse traffic for anomalies, building on Phase 1 data.
Phase 3: Deep Telemetry
Only when ready, incorporate endpoint logs from supported devices. This includes engineering workstations and modern PLCs, but always prioritize safety. As a reminder, test in non-production environments first.
This phased approach ensures quick wins while scaling towards increased security.
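To make the Phase 2 idea of protocol-aware detection concrete, here is a small Python sketch that reads the header of a Modbus/TCP frame and flags write commands from sources that are not on an approved list. The function codes are the standard Modbus write codes; the function names and the `allowed_writers` set are assumptions for illustration. In practice a passive monitoring platform would do this parsing for you.

```python
import struct
from typing import Optional, Set, Tuple

# Standard Modbus function codes that change controller state.
WRITE_FUNCTION_CODES = {
    0x05: "Write Single Coil",
    0x06: "Write Single Register",
    0x0F: "Write Multiple Coils",
    0x10: "Write Multiple Registers",
}


def parse_modbus_tcp(frame: bytes) -> Optional[Tuple[int, int]]:
    """Return (unit_id, function_code) from a Modbus/TCP frame, or None if malformed."""
    if len(frame) < 8:
        return None
    # MBAP header: transaction id, protocol id, length (2 bytes each), unit id (1 byte).
    _transaction_id, protocol_id, _length, unit_id = struct.unpack(">HHHB", frame[:7])
    if protocol_id != 0:  # Modbus always uses protocol id 0
        return None
    return unit_id, frame[7]


def flag_unexpected_write(frame: bytes, src_ip: str, allowed_writers: Set[str]) -> Optional[str]:
    """Describe a write command from an unapproved source, or return None."""
    parsed = parse_modbus_tcp(frame)
    if parsed is None:
        return None
    unit_id, function_code = parsed
    if function_code in WRITE_FUNCTION_CODES and src_ip not in allowed_writers:
        name = WRITE_FUNCTION_CODES[function_code]
        return f"{name} to unit {unit_id} from unapproved source {src_ip}"
    return None
```

Reads pass through silently and only writes from unexpected hosts produce output, which keeps the detection aligned with the "changes, not constant streams" principle.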
Tuning Principles to Reduce the OT Noise
Noise reduction is about smart tuning, not elimination. How you dial this in should, of course, be tailored to your environment and goals. Here are the tuning principles I apply.
Baseline and Define Normalcy: Map “normal” by site and process schedule. A factory’s night shift might have different patterns than daytime operations. Consider those nuances with care and define normal for your specific scenarios.
Focus on Changes: Alert on deviations and rare events, not constant data streams. A new device appearing is more critical than routine pings.
Correlate Events: Combine signals for confidence. A remote session followed by an unusual command and a new device? That’s a high-priority alert worth investigating.
Prioritize Detections: Develop 10-15 key rules with clear response steps. Avoid generic thresholds; tailor them to your environment for relevance.
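The principles above can be sketched in a few lines of Python: compare observed devices against a per-site baseline, then raise priority only when several distinct signal types land in the same time window. The `Event` fields, the event kind names, and the 30-minute window are all illustrative assumptions, not a recommended standard.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    time: datetime
    site: str
    kind: str    # e.g. "remote_session", "unusual_command", "new_device"
    detail: str


def new_device_events(seen, baseline, site, when):
    """Emit one event per device identifier that is absent from the site baseline."""
    return [Event(when, site, "new_device", dev) for dev in sorted(seen - baseline)]


def correlate(events, window=timedelta(minutes=30), min_kinds=2):
    """Report sites where several distinct signal kinds occur within one window."""
    by_site = defaultdict(list)
    for event in sorted(events, key=lambda e: e.time):
        by_site[event.site].append(event)

    alerts = []
    for site, site_events in by_site.items():
        for i, first in enumerate(site_events):
            kinds = {e.kind for e in site_events[i:] if e.time - first.time <= window}
            if len(kinds) >= min_kinds:
                priority = "high" if len(kinds) >= 3 else "medium"
                alerts.append((site, first.time, priority, sorted(kinds)))
                break  # one alert per site keeps this sketch quiet
    return alerts
```

A single remote session produces nothing here; a remote session, an unusual command, and a new device within the window produce one high-priority alert, mirroring the correlation example in the text.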
Operationalizing through the Noise: Who Does What
To make logging work, define roles clearly. Start by assigning triage owners and, if possible, a dedicated OT security analyst for initial reviews.
Next, establish escalation paths: low-confidence alerts go to operations, while critical ones trigger incident response teams. The key here is to remove as much subjectivity as possible, while accepting that you can’t anticipate every situation.
It’s also imperative you make alerts actionable. Include context like asset criticality (e.g., “This PLC controls the main production line”), location, and last known changes.
Finally, develop response playbooks: “For unauthorized access, isolate the session and notify the vendor.” Shared knowledge, clear accountability, and practice won’t make “perfect,” but they will dramatically increase your cyber resilience in OT.
Keep in mind, integration with SIEM or SOAR tools can automate correlation, freeing staff for high-value tasks.
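As a rough illustration of actionable alerts and fixed escalation paths, the Python sketch below enriches an alert with asset context and routes it by rule rather than by judgment. The `ASSETS` inventory, field names, and queue names are hypothetical, and the precedence (critical asset first, then low confidence) is an assumption you would adjust to your own escalation policy.

```python
# Hypothetical asset inventory; in practice this would come from a CMDB or asset tool.
ASSETS = {
    "plc-line1": {
        "criticality": "high",
        "location": "Plant A, Line 1",
        "role": "controls the main production line",
    },
    "hist-02": {
        "criticality": "low",
        "location": "Plant A, server room",
        "role": "secondary historian",
    },
}


def enrich(alert: dict) -> dict:
    """Attach asset context so the person triaging does not have to look it up."""
    asset = ASSETS.get(alert["asset"], {})
    return {
        **alert,
        "criticality": asset.get("criticality", "unknown"),
        "location": asset.get("location", "unknown"),
        "role": asset.get("role", "unknown"),
    }


def route(alert: dict) -> str:
    """Pick an escalation queue from fixed rules to reduce subjectivity."""
    if alert["criticality"] == "high" and alert["confidence"] != "low":
        return "incident_response"
    if alert["confidence"] == "low":
        return "operations"
    return "ot_security_triage"
```

For example, `route(enrich({"asset": "plc-line1", "rule": "unauthorized_logic_change", "confidence": "high"}))` selects the incident response queue, while a low-confidence alert on the historian goes to operations.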
What Now?: Your 30-Day Checklist
Often in our Koniag Cyber articles, we like to provide an achievable checklist, a 30-day “do this now”, that can help you find a place to start, gain momentum, and make significant progress in just one month.
Here’s a practical 30-day plan for increasing signal and reducing noise in OT:
Week 1: Enable session logging for all vendor remote access. Audit existing tools and enforce multifactor authentication.
Week 2: Turn on firewall logging at the IT/OT boundary. Focus on denies first to catch obvious threats.
Week 3: Create a short list of “stop the line” alerts - e.g., unauthorized PLC changes or cross-zone traffic. Test them in a lab setup.
Week 4: Run a mini tabletop exercise using real logs. Simulate “suspicious remote access to OT” and walk through responses to build team muscle memory.
Track progress and adjust based on feedback.
Logging Is a Trust Exercise
In the end, good OT logging is about building trust. When logs help operations prevent downtime and optimize processes, you earn adoption from skeptical teams. It’s not just a security mandate; it’s a partnership for resilience. By focusing on fewer, better signals, you transform logging from a burden into a true asset. Start small, tune relentlessly, and watch your OT environment become more secure without the noise.

