New in KB: How to handle disk full issues

  • 19 April 2022

  • Anonymous

Occasionally, we encounter issues where one of the mount points in LogPoint is full. A disk can fill up for various reasons, and it is of utmost importance to free some space immediately so that LogPoint can function normally.

When /opt or the primary repo path /opt/immune/storage is full, log collection in LogPoint is affected.

Detection

To detect disk full situations, we can use the df command.

df -h

In the output of this command, we can either look for the percent usage of each mount point or the available storage space. These indicators will help us detect disk full scenarios.
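As a rough sketch of how the percent usage check could be automated, the helper below filters df output for mount points at or above a threshold. The function name full_mounts and the default threshold of 90 are assumptions for illustration, not part of LogPoint. It reads the POSIX format produced by df -P and assumes mount points contain no spaces.

```shell
#!/bin/sh
# Print mount points whose usage is at or above a threshold (default 90%).
# Reads `df -P` style output on stdin so the filter can be tried on saved output.
# full_mounts is a hypothetical helper name, not a LogPoint tool.
full_mounts() {
    threshold="${1:-90}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)              # strip the trailing percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# Usage: df -P | full_mounts 90
```

Each output line shows a mount point followed by its usage, so a full /opt or /opt/immune/storage stands out immediately.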

Mitigation

Once we have established that the problem in LogPoint is caused by a lack of storage space, we can dive deeper.

/opt path has 100% storage used

The /opt mount point generally stores the necessary config files, service log files, MongoDB data and executables of LogPoint. For LogPoint to function normally, it is critical that some storage space remains available here.
Since this mount point does not actively store much data under normal conditions, it is unlikely to reach 100% usage. When it does, we have to investigate with the du command and find out which directory or file is filling the disk. The command that helps is as follows:

du -sch <file_1> <file_2> <directory_1> <directory_2>
# To check all files and folders in the current working directory
du -sch *

It is important to run this command manually across the directories inside /opt to find the culprit. Note: /opt/immune/storage is usually mounted on a different pool or LVM volume.
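To avoid checking every directory one by one, the du output can be sorted by size. The following is a minimal sketch that assumes GNU du (for -x and --max-depth); largest_entries is a hypothetical helper name. The -x option keeps du on a single filesystem, so a separately mounted /opt/immune/storage is not counted towards /opt.

```shell
#!/bin/sh
# List the largest entries directly under a directory, biggest first.
# Sizes are in KiB; the first line is the total for the directory itself.
# Assumes GNU du; largest_entries is a hypothetical helper name.
largest_entries() {
    dir="${1:-.}"
    count="${2:-10}"
    # -x stays on one filesystem, --max-depth=1 reports only direct children
    du -xk --max-depth=1 -- "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Usage: largest_entries /opt 10
```

Running it again on the largest subdirectory narrows the search down level by level.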

Frequently encountered cases 

  1. Storage occupied by old upgrade patch files.
    The old upgrade patch files are stored in the /opt/immune/var/patches/installed/ directory. These patch files range from a few MBs to hundreds of MBs, and they can be the reason for /opt being full. Older patch files can be deleted if we are sure that the corresponding upgrades were installed successfully in LogPoint.
     
  2. Storage occupied by mongoDB data directory
    MongoDB's data directory is /opt/immune/db/. The database can grow very large when LogPoint holds a lot of configuration data.
    In that case, please contact LogPoint support.
     
  3. Storage occupied by service log files
    The service log files are stored in the /opt/immune/var/log/ directory. When a service is in debug mode, or because of some errors, certain log files can swell to an unexpected size. In such cases, we have no option but to locate the anomalous files and delete them. The same du command can be used to check file sizes.
    Since the content of those files has already been indexed through LogPoint's log ingestion pipeline, it is fine to delete the service logs. But only do so if you are sure; otherwise contact LogPoint support.
     
  4. Storage occupied by nxlog dump files 
    We have observed this issue at a few customers, where nxlog dumps files in the directory
    /opt/nxlog/var/spool/nxlog/.
    These files can fill up the /opt mount point. Cleaning the dump files, or moving them to another, larger mount, should help. This issue has been addressed in recent versions of LPAgent, so please update to the latest one to avoid it.
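For locating anomalous files such as oversized service logs, a size filter can be quicker than reading du output by eye. This is a minimal sketch assuming GNU find; big_files and the 100 MiB default are assumptions for illustration. It only lists files and deletes nothing.

```shell
#!/bin/sh
# List regular files larger than a threshold (in MiB) under a directory,
# largest first, with sizes in KiB. Nothing is deleted.
# Assumes GNU find; big_files is a hypothetical helper name.
big_files() {
    dir="$1"
    min_mib="${2:-100}"
    # -xdev keeps find on the same filesystem as the starting directory
    find "$dir" -xdev -type f -size +"${min_mib}"M -exec du -k {} + 2>/dev/null | sort -rn
}

# Usage: big_files /opt/immune/var/log 100
```

The same helper can be pointed at /opt/immune/var/patches/installed/ or, with sufficient permissions, at /opt/nxlog/var/spool/nxlog/.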
     

/opt/immune/storage has 100% storage used

Usually the /opt/immune/storage mount point has more storage space than /opt because, as the primary retention path, it has to store the logs and index files.

If this mount point reaches 100% usage, log collection halts and related services stop functioning, so it is important to fix such issues quickly. To drill down into which directory is using a lot of space, the same du command does the trick.

The probable cases when /opt/immune/storage is full can be as follows:

  1. Storage occupied by logs and indices
    In most cases, when /opt/immune/storage is full, the cause is the logs and indices. The logs and indices directories grow in size with the data stored by LogPoint.

    Normally we would expect the disk to be sized properly, so that the stored logs do not exceed the provisioned space. Sometimes, however, the event rate of some repos increases abruptly. In such scenarios we can either decrease the retention for the repos holding the most data, or allocate more disk to accommodate the increased log volume.
  2. Storage occupied by buffer directories
    Some buffer directories can fill up because of issues in LogPoint, and that can cause storage full scenarios. These buffer directories are as follows:
    • /opt/immune/storage/highavailability/ - Fills up when there is an issue in the high availability (HA) functionality.
    • /opt/immune/storage/OldLogsKeeper/ - Fills up when too many old logs are coming in to the LogPoint machine.
    • /opt/immune/storage/FileKeepeer - If there is an issue in the indexsearcher service, logs are buffered in this directory.

      If any of the above directories is occupying too much space, please call support for assistance.
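To see at a glance whether one of these buffer directories is the problem, their sizes can be printed side by side. This is a minimal sketch; dir_sizes is a hypothetical helper name, and the paths in the usage line are the default ones listed above.

```shell
#!/bin/sh
# Print the size of each given directory in KiB, and report any that
# do not exist instead of failing. dir_sizes is a hypothetical helper name.
dir_sizes() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            du -sk -- "$d" 2>/dev/null
        else
            printf 'missing\t%s\n' "$d"
        fi
    done
}

# Usage with the default paths from the list above:
# dir_sizes /opt/immune/storage/highavailability /opt/immune/storage/OldLogsKeeper /opt/immune/storage/FileKeepeer
```

The output is only meant to support the conversation with support; it does not clean anything up.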

In any of the above situations, if you are not sure, call support for help. The paths mentioned here are for default installations. If the data mount point or other settings have been customized, the paths might differ.

 

Note: The paths /opt/makalu and /opt/immune are actually the same, because in LogPoint /opt/immune is a symlink to /opt/makalu.
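If du reports the same data under both names, a quick check of where the paths resolve can confirm it. This is a minimal sketch assuming GNU readlink; same_target is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when two paths resolve to the same real location.
# Assumes GNU readlink (-f); same_target is a hypothetical helper name.
same_target() {
    [ "$(readlink -f -- "$1")" = "$(readlink -f -- "$2")" ]
}

# Usage: same_target /opt/immune /opt/makalu && echo "same directory"
```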


5 replies

What account do I need to run these commands under?  I am not able to perform this using the li-admin login.

Hi Stephen,

 

Let me get you help on this :)

@Nils Krumey 


Hi @Stephen Barton,

What commands are you referring to specifically? I just checked, and du -sch * should work in pretty much all of the mentioned directories, even as li-admin - the only one that li-admin doesn’t have access to is the nxlog spool directory, but any release higher than 7.0 should have cleaned that up anyway.

The most important one out of all these is obviously /opt/immune/storage, as that is by default where the logs and indexes end up, and most of the time systems simply run out of disk space because there are too many logs. The du command should help to determine whether it is indeed logs or one of the system directories above that have filled up the disk space.

The du command might throw a few errors on some subfolders that li-admin does not have access to (for example, the new SOAR container location in /opt/immune/storage cannot be accessed), but in most cases the overall total should still be reasonably close.

Hi @Nils Krumey ,

 

E.g. “The old upgrade patch files are stored in /opt/immune/var/patches/installed/ directory. These patch files range from few MBs to hundreds of MBs and they can be reason for /opt being full. These older patch files can be deleted, if we are sure that these old upgrades are successfully installed in LogPoint.”

I try: rm -rf /opt/immune/var/patches/installed/patch-logpoint-6.9.0 and get

rm: cannot remove [...]: permission denied.

User and Group for the upgrade files are “loginspect”.


Hi, yes you would need root access (or a partner account) for that - Support should be able to do that for you.
