Sidekiq Memory Use


We have noticed a number of memory issues with the self-hosted Forem test instance (a 4GB DigitalOcean droplet). Initially the assumption was that the pressure was coming from Elasticsearch (a fairly heavyweight process sitting on the JVM, a cost we couldn't spread across other components since the Ruby stack is not using JRuby). Elasticsearch has since been removed, but the OOM lockups on the test instance continued.

Joe was able to install Datadog monitoring in the pod so we could get a view of memory use and identify the problem service - it turned out to consistently be Sidekiq.

I attempted to disable the ActiveRecord query cache for Sidekiq (PR #13861, linked at the bottom of this page) and we let the process run for a day - the results were disappointing.
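
For context, disabling the query cache for job execution comes down to a small Sidekiq server middleware along these lines (a sketch only, not the exact diff from the PR - just the general shape of the change):

# Sketch: run every job with the ActiveRecord query cache turned off.
class DisableQueryCacheMiddleware
  def call(_worker, _job, _queue)
    ActiveRecord::Base.uncached { yield }
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add DisableQueryCacheMiddleware
  end
end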

Strangely, there's little to no correlation between the "steps" up in the graph and running jobs - when I zoom in closer (1h scale) each step is actually just a larger hump in what looks like a sawtooth pattern, with the troughs afterwards settling about 100MB higher than they were before.

Another look at Sidekiq (showing restarts both yesterday and today)

I dumped the service logs for the relevant timespan - the Sidekiq allocations over an extended period make little to no sense (this is a quiet instance; there's only the scheduled jobs, and they all complete in under 2 seconds).
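
If the stock log lines aren't enough, a per-job view could be produced with a small server middleware that records allocations and RSS movement around each job, so individual jobs can be lined up against the memory graph (a sketch, not instrumentation we actually have in place):

# Sketch: log object allocations and RSS change around each job execution.
class JobAllocationLogger
  def call(worker, _job, _queue)
    allocated_before = GC.stat(:total_allocated_objects)
    rss_before_kb = `ps -o rss= -p #{Process.pid}`.to_i
    yield
  ensure
    allocated = GC.stat(:total_allocated_objects) - allocated_before
    rss_delta_kb = `ps -o rss= -p #{Process.pid}`.to_i - rss_before_kb
    Sidekiq.logger.info("#{worker.class}: #{allocated} objects allocated, RSS change #{rss_delta_kb} KB")
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware { |chain| chain.add JobAllocationLogger }
end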

Possible suspicions:

Could fog be allocating and not releasing something when we fail to create the sitemap? (The issue is that AWS config is required but not present on this instance.)

Could the unique job middleware or another Sidekiq middleware/plugin, unrelated to any particular job, be allocating regularly?

What's next?

Is there a job responsible?

Rhymes suggested running the scheduled jobs one by one to see if there's a link between growth over time and some particular job (by disabling all scheduled jobs but one, cycling through them one at a time).

I'm more inclined to run the same job 1000 times in a row and watch memory directly to track that down (see the sketch after the job list below)...

schedule = YAML.load(File.open("config/schedule.yml"))

schedule.map { |j| j.second.slice("class", "args") }
=> [{"class"=>"Feeds::ImportArticlesWorker"},
 {"class"=>"Metrics::RecordBackgroundQueueStatsWorker"},
 {"class"=>"Metrics::RecordDailyUsageWorker"},
 {"class"=>"Metrics::RecordDailyNotificationsWorker"},
 {"class"=>"Metrics::RecordDataCountsWorker"},
 {"class"=>"Metrics::CheckDataUpdateScriptStatuses"},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_yearly_club", ""]},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_beloved_comment", ""]},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_four_week_streak", ""]},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_eight_week_streak", ""]},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_sixteen_week_streak", ""]},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_tag", ""]},
 {"class"=>"BadgeAchievements::BadgeAwardWorker", "args"=>["", "award_contributor_from_github", ""]},
 {"class"=>"HtmlVariants::RemoveOldDataWorker"},
 {"class"=>"Tags::ResaveSupportedTagsWorker"},
 {"class"=>"Listings::ExpireOldListingsWorker"},
 {"class"=>"Broadcasts::SendWelcomeNotificationsWorker"},
 {"class"=>"SitemapRefreshWorker"},
 {"class"=>"BustCachePathWorker", "args"=>["/feed.xml"]},
 {"class"=>"BustCachePathWorker", "args"=>["/badge"]},
 {"class"=>"BustCachePathWorker", "args"=>["/"]},
 {"class"=>"Emails::EnqueueDigestWorker"},
 {"class"=>"Notifications::RemoveOldNotificationsWorker"},
 {"class"=>"Credits::SyncCounterCache"},
 {"class"=>"Podcasts::EnqueueGetEpisodesWorker"},
 {"class"=>"GithubRepos::UpdateLatestWorker"},
 {"class"=>"PushNotifications::CleanupWorker"}]

The badge award worker takes 3 positional args (usernames, badge_slug, and message), which is why passing the slug alone still requires two empty string args.
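
A rough version of the "run it 1000 times" idea, from a Rails console on the test instance - pick one of the scheduled workers above, run it inline in a loop, and watch the process RSS (the helper and the worker choice here are just a sketch):

# Report this process's resident set size in MB (ps reports KB on Linux).
def rss_mb
  `ps -o rss= -p #{Process.pid}`.to_i / 1024
end

puts "before: #{rss_mb} MB"
1000.times { Feeds::ImportArticlesWorker.new.perform }
# Badge awards need their three positional args, e.g.:
# 1000.times { BadgeAchievements::BadgeAwardWorker.new.perform("", "award_tag", "") }
GC.start
puts "after: #{rss_mb} MB"

If RSS settles back down after a GC, that job probably isn't the culprit; if the floor keeps ratcheting up the way it does on the graph, it deserves a closer look.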

Disable active record query cache for sidekiq by djuber · Pull Request #13861 · forem/forem

[Image: sidekiq process from Datadog]
[Image: Datadog Sidekiq processes over time]