Monday 20 February 2017

linux-4.10-ck1, MuQSS version 0.152 for linux-4.10

Announcing a new -ck release, 4.10-ck1, with a new version of the Multiple Queue Skiplist Scheduler, version 0.152. These are patches designed to improve system responsiveness and interactivity, with specific emphasis on the desktop but configurable for any workload.

linux-4.10-ck1

-ck1 patches:
http://ck.kolivas.org/patches/4.0/4.10/4.10-ck1/

Git tree:
https://github.com/ckolivas/linux/tree/4.10-ck

Ubuntu 16.10 packages (sorry I'm no longer on 16.04):
http://ck.kolivas.org/patches/4.0/4.9/4.10-ck1/Ubuntu16.10/

MuQSS

Download:
4.10-sched-MuQSS_152.patch

Git tree:
4.10-muqss


MuQSS 0.152 updates

Removed the rapid ramp-up in schedutil cpufreq which was overactive.
Bugfixes

4.10-ck1 updates

Apart from resyncing with the latest tree from linux-bfq:
- The wb-buf-throttling patches are now part of mainline and do not need to be added separately
- Minor swap setting tweaks

For those of you trying to build the evil nvidia driver for linux-4.10, this patch will help:
nvidia-375.39-linux-4.10.patch

Enjoy!
Please enjoy
-ck

90 comments:

  1. Nice low latency you got there ;).
    Also thanks for the nvidia patch, couldn't find one before.

  2. @CK I have weird SCHED logs in dmesg

    #3
    [ 0.753442] SCHED: No cpumask for kworker/4:0/36

    ...
    [ 0.131197] TSC deadline timer enabled
    [ 0.131200] smpboot: CPU0: Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz (family: 0x6, model: 0x3a, stepping: 0x9)
    [ 0.131243] Performance Events: PEBS fmt1+, IvyBridge events, 16-deep LBR, full-width counters, Intel PMU driver.
    [ 0.131263] ... version: 3
    [ 0.131264] ... bit width: 48
    [ 0.131264] ... generic registers: 4
    [ 0.131265] ... value mask: 0000ffffffffffff
    [ 0.131266] ... max period: 00007fffffffffff
    [ 0.131266] ... fixed-purpose events: 3
    [ 0.131266] ... event mask: 000000070000000f
    [ 0.201364] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
    [ 0.221236] smp: Bringing up secondary CPUs ...
    [ 0.291257] SCHED: No cpumask for kworker/1:0/18
    [ 0.301260] SCHED: No cpumask for kworker/1:0H/19
    [ 0.301289] x86: Booting SMP configuration:
    [ 0.301290] .... node #0, CPUs: #1
    [ 0.453414] SCHED: No cpumask for kworker/2:0/24
    [ 0.453427] SCHED: No cpumask for kworker/2:0H/25
    [ 0.453441] #2
    [ 0.603439] SCHED: No cpumask for kworker/3:0/30
    [ 0.603452] SCHED: No cpumask for kworker/3:0H/31
    [ 0.603463] #3
    [ 0.753442] SCHED: No cpumask for kworker/4:0/36
    [ 0.753457] SCHED: No cpumask for kworker/4:0H/37
    [ 0.753466] #4
    [ 0.903475] SCHED: No cpumask for kworker/5:0/42
    [ 0.903488] SCHED: No cpumask for kworker/5:0H/43
    [ 0.903499] #5
    [ 1.053497] SCHED: No cpumask for kworker/6:0/48
    [ 1.053509] SCHED: No cpumask for kworker/6:0H/49
    [ 1.053521] #6
    [ 1.203507] SCHED: No cpumask for kworker/7:0/54
    [ 1.203521] SCHED: No cpumask for kworker/7:0H/55
    [ 1.203530] #7
    [ 1.353452] smp: Brought up 1 node, 8 CPUs
    [ 1.353454] smpboot: Total of 8 processors activated (36719.67 BogoMIPS)
    [ 1.360342] MuQSS locality CPU 0 to 1: 2
    [ 1.360343] MuQSS locality CPU 0 to 2: 2
    [ 1.360343] MuQSS locality CPU 0 to 3: 2
    [ 1.360344] MuQSS locality CPU 0 to 4: 1
    [ 1.360344] MuQSS locality CPU 0 to 5: 2
    [ 1.360345] MuQSS locality CPU 0 to 6: 2
    [ 1.360345] MuQSS locality CPU 0 to 7: 2
    [ 1.360346] MuQSS locality CPU 1 to 2: 2
    [ 1.360347] MuQSS locality CPU 1 to 3: 2
    [ 1.360347] MuQSS locality CPU 1 to 4: 2
    [ 1.360347] MuQSS locality CPU 1 to 5: 1
    [ 1.360348] MuQSS locality CPU 1 to 6: 2
    [ 1.360348] MuQSS locality CPU 1 to 7: 2
    [ 1.360349] MuQSS locality CPU 2 to 3: 2
    [ 1.360350] MuQSS locality CPU 2 to 4: 2
    [ 1.360350] MuQSS locality CPU 2 to 5: 2
    [ 1.360350] MuQSS locality CPU 2 to 6: 1
    [ 1.360351] MuQSS locality CPU 2 to 7: 2
    [ 1.360352] MuQSS locality CPU 3 to 4: 2
    [ 1.360352] MuQSS locality CPU 3 to 5: 2
    [ 1.360353] MuQSS locality CPU 3 to 6: 2
    [ 1.360353] MuQSS locality CPU 3 to 7: 1
    [ 1.360354] MuQSS locality CPU 4 to 5: 2
    [ 1.360354] MuQSS locality CPU 4 to 6: 2
    [ 1.360355] MuQSS locality CPU 4 to 7: 2
    [ 1.360355] MuQSS locality CPU 5 to 6: 2
    [ 1.360356] MuQSS locality CPU 5 to 7: 2
    [ 1.360356] MuQSS locality CPU 6 to 7: 2
    [ 1.360570] devtmpfs: initialized
    ...

    FULL dmesg: https://pastebin.com/raw/9YbMTik5

    CONFIG: https://github.com/FadeMind/linux410-custom.src/blob/master/linux410/config.x86_64

    Regards

    FadeMind

    Replies
    1. I put them there much like the MuQSS locality messages. They're harmless and for my information.
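
For readers who want to see what the locality lines encode, here is a minimal sketch that parses them out of dmesg text and groups CPU pairs by the reported value. The helper names are hypothetical. Reading value 1 as "hyper-thread sibling" on this 4-core/8-thread i7 is an assumption based on the pairing in the log above, not something stated in the post.

```python
import re
from collections import defaultdict

# Matches dmesg lines such as "[ 1.360344] MuQSS locality CPU 0 to 4: 1".
LOCALITY_RE = re.compile(r"MuQSS locality CPU (\d+) to (\d+): (\d+)")


def parse_locality(dmesg_text):
    """Return {(cpu_a, cpu_b): locality} for every locality line found."""
    pairs = {}
    for match in LOCALITY_RE.finditer(dmesg_text):
        a, b, level = (int(g) for g in match.groups())
        pairs[(a, b)] = level
    return pairs


def group_by_level(pairs):
    """Group CPU pairs by their reported locality value."""
    groups = defaultdict(list)
    for pair, level in sorted(pairs.items()):
        groups[level].append(pair)
    return dict(groups)


# A few lines copied from the log in the comment above.
sample = """\
[ 1.360342] MuQSS locality CPU 0 to 1: 2
[ 1.360344] MuQSS locality CPU 0 to 4: 1
[ 1.360347] MuQSS locality CPU 1 to 5: 1
[ 1.360350] MuQSS locality CPU 2 to 6: 1
[ 1.360353] MuQSS locality CPU 3 to 7: 1
"""

closest = group_by_level(parse_locality(sample))[1]
print(closest)  # [(0, 4), (1, 5), (2, 6), (3, 7)]
```

Feeding it the output of `dmesg` instead of `sample` gives the full table for a machine.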

  3. Thanks Con! I built and ran x64 and i686-UP on Arch last night; working fine.

    Replies
    1. Yes, thanks!
      No problems so far.

  4. This comment has been removed by the author.

  5. Too bad there isn't an nvidia 340.x driver for 4.10 yet.

  6. Have a look at the end of Con's -ck1 announcement for 4.10; he provided a link to an nvidia driver patch.
    Peter

    Replies
    1. Yes, I saw it, but it is for the 375.x driver series. Thanks anyway.

    2. First hit on google search:

      https://devtalk.nvidia.com/default/topic/982052/linux/latest-nvidia-driver-340-101-builds-compiles-properly-but-fails-to-load-has-errors-with-linux-kernel-4-9-resolved-with-patch-/

      search terms: "nvidia 340 4.10 kernel"

  7. As with the 4.9.0 -ck, I am getting huge spikes in several CPU monitors while the system is actually idle process-wise. `top` sees my CPUs at '100% si' almost constantly, `xosview` displays 100% "SYS" spikes in a second's interval or less, and XFCE4's xfce4-systemload-plugin shows the CPU at 100% constantly. I set CONFIG_HZ=300. Any hints how to get a usable CPU load monitoring again?

    Replies
    1. I have the same problem with 4.10-muqss branch.

    2. Con, any hints on how to track this down, and possibly fix this?

    3. It's an accounting error (it's not actually using extra CPU.) Unless you can hack the code and fix it, there's nothing more you can do until I find time to investigate and fix it (which alas won't be any time soon.)
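
To make the "accounting" point concrete: monitors such as `top` typically derive their percentages from the tick counters on the aggregate `cpu` line of /proc/stat, so a miscounted softirq tick shows up as '100% si' without any real CPU time being burnt. The sketch below shows that calculation. The function names and the two sample lines are hypothetical, chosen only to illustrate the arithmetic.

```python
# First eight counters on the "cpu" line of /proc/stat, in order.
# The trailing guest counters are skipped because they are already
# included in user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu ...' line of /proc/stat into a dict of ticks."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def percentages(before, after):
    """Share of elapsed ticks spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical samples taken one interval apart (illustration only).
first = parse_cpu_line("cpu  100 0 50 800 10 0 40 0 0 0")
second = parse_cpu_line("cpu  110 0 55 880 10 0 45 0 0 0")
print(percentages(first, second)["softirq"])  # 5.0
```

On a live system the two samples would come from reading the first line of /proc/stat twice, a second or so apart.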

    4. Thanks for the heads-up. I unfortunately cannot fix it myself, but as long as it is on the list, I'm a happy camper. :)

  8. Great work once more on updating MuQSS. Personally I think it's a great scheduler. I've been getting very impressive results from it when combined with the schedutil governor and using yield_type 2, interactive 1 and rr_interval 1.

    Not only is the system incredibly responsive, but performance also seems to be at its best with these settings. Mileage may vary for other people but I could not be happier.
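
A minimal sketch of applying such a combination of tunables from a script. The location of rr_interval (/proc/sys/kernel/rr_interval) appears elsewhere in this thread; that `interactive` and `yield_type` live in the same directory is an assumption here, and `set_tunables` is a hypothetical helper, not part of any -ck tooling. Writing to /proc/sys needs root.

```python
from pathlib import Path

# rr_interval's location is given in this thread; placing "interactive"
# and "yield_type" in the same directory is an assumption.
SYSCTL_DIR = Path("/proc/sys/kernel")


def set_tunables(settings, base=SYSCTL_DIR):
    """Write each integer tunable to <base>/<name>; return the names written."""
    written = []
    for name, value in settings.items():
        path = Path(base) / name
        if not path.exists():
            # Not a MuQSS kernel, or the tunable has a different name.
            continue
        path.write_text(f"{int(value)}\n")
        written.append(name)
    return written
```

For the settings described above, that would be `set_tunables({"yield_type": 2, "interactive": 1, "rr_interval": 1})`, run as root. Tunables that do not exist on the running kernel are skipped rather than created.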

    Replies
    1. +1 All of the above.
      It is the best, no doubt.

  9. echo 1 > /proc/sys/kernel/rr_interval gives nice low latency for "real-time" audio work.
    Thanks a lot.

    Replies
    1. Astonishing. And this doesn't hurt throughput in any way? In my earlier tests, some years and kernels ago, setting it to 1 not only affected disk I/O negatively, but also kept gfx and audio from being "in-time".

      Are you using the full feature -ck1 or the MuQSS only patch?

      BR, Manuel Krause

    2. Yes, it hurts throughput.

      But when the CPU is fast it takes some "abuse" to reach that point.

      On a slow CPU it might not be much fun, since it might choke all the time when the value is too low.

      ck1.

  10. @27 February 2017 at 05:30:

    Actually, I'm running rr_interval 1, interactive 1 and yield_type 2 and whereas one might expect that to hurt throughput, from some testing (both synthetic as well as real world) I've actually found that throughput seems to be BETTER than with, for example, rr_interval 6, interactive 0 and yield_type 1 (or 0).

    I suspect this has to do with more and more applications as well as OS subsystems becoming increasingly multithreaded and the overhead of the context switching (yield type 2 and rr_interval 1) being less than the overhead of threads simply waiting for other threads to complete their tasks.

    Something along those lines anyhow.

    Just to give you an idea -- running a demoscene demo (synthetic metric, obviously) in WINE sees a 12% difference for me between running the highly cooperative mode (yield 2, interval 1, interactive 1) and the highly selfish mode (yield 0, interval 6+, interactive 0). In favour of the cooperative approach.

    Replies
    1. @Anon,

      please specify whether (in Your tests) You use performance / ondemand / powersave governor and which of cpufreq or p-state You actually use.
      As mentioned in different threads here and there, ondemand vs performance itself is a big win, at least on non p-state capable hw, if You get 12% out of performance that's neat and worth a try :)
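
One way to report exactly this information is to read it from sysfs. The sketch below assumes the usual cpufreq layout under /sys/devices/system/cpu/cpu0/cpufreq; `cpufreq_info` is a hypothetical helper name.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for the first CPU.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def cpufreq_info(base=CPUFREQ_DIR):
    """Return the scaling driver and governor, or None where unreadable."""
    info = {}
    for name in ("scaling_driver", "scaling_governor"):
        try:
            info[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # File missing (no cpufreq support) or not readable.
            info[name] = None
    return info


print(cpufreq_info())
```

Posting the resulting driver/governor pair alongside benchmark numbers makes results from different machines easier to compare.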

    2. Schedutil. Been a fan of that one since it was first implemented. Tried it with ondemand as well and even that was a performance degradation. Performance might be on-par with schedutil but I'd hardly wager it being better.

    3. Obviously I meant that the 'Performance' governor might be on-par with the 'schedutil' governor.

  11. If you have an Intel CPU, I would be cautious about schedutil.

    I've tested it with CFS on my Intel 4770k, with both acpi-cpufreq and intel_pstate (by adding intel_pstate=passive to kernel boot line, a new option of 4.10).

    It is broken: the CPU frequency is always locked at the maximum turbo frequency (3.9GHz in my case), and the performance is bad with acpi-cpufreq (I didn't benchmark intel_pstate).

    I've not tried MuQSS with schedutil.

    Pedro

    Replies
    1. I use "intel_pstate=disable intel_idle.max_cstate=0 idle=poll nohalt" on Intel CPUs for maximum performance.

    2. pstate passive + schedutil scales correctly for me, but the performance is way lower than cpufreq + schedutil.
      I don't know why.

    3. pstate + schedutil + mux = almost always max standard freq for me on skylake here. Not usable at all.
      pstate or cpufreq both are ok separately.

      Br, Eduardo

  12. Thanks Con for this new release.

    Here are the usual throughput benchmarks on 4.10:
    https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing

    Pedro

    Replies
    1. Would it be possible to re-run the latency tests (Interbench)? I am wondering whether it's even worth running MuQSS, because throughput is probably better with CFS, but I am not sure about the latencies. I've been using both schedulers and can't find any difference latency-wise; my workload consists of compiling large projects like LLVM/Chromium while programming, and I haven't noticed anything slowing down even with CFS.

    2. Latency-wise CFS is a turtle while MUQSS is a rabbit. HTH.

    3. But then you have the catch: MuQSS aims for low latency vs. CFS.

      BR, Manuel Krause

    4. Added the interbench results.

      Pedro

    5. I've done some latency oriented tests with runqlat and cyclictest, on MuQSS152@100Hz and CFS@300Hz.
      This time I hope I got it right.
      Charts are at the bottom of the sheets.

      The cpu is loaded with a linux kernel build at various -j.
      During the build runqlat or cyclictest are run with the following command lines :
      'runqlat -m 180 1'
      'cyclictest -q -S -m -D3m -H=40000'

      I also ran cyclictest at +1 nice level, as suggested by some doc I read.

      Overall MuQSS shows much higher average latencies under high load, but lower max latencies under runqlat.

      Maybe ck or someone else can comment on these.
      Is it expected ? Are the tests not measuring the right thing ?

      Pedro

    6. Again it's not testing what you think it's testing. Try changing the yield proc tunable and you'll see the results will change.

    7. Additionally, the functions it hooks into aren't exactly the same, so the results are never going to be directly comparable.

    8. Thanks for replying.

      The thing is, I try to back up the positive comments I read on MuQSS latency with figures, as I don't feel any difference between MuQSS and CFS with my workload.
      I guess it's not that easy.

      I'll try changing the sched_yield setting.

      I had looked at cyclictest source code and didn't see any call to sched_yield, so I thought it was the right tool to compare CFS and MuQSS.
      Well, I just don't understand this scheduling stuff :(

      Pedro

    9. I've done some testing with yield_type setting.
      It doesn't make real differences with this workload (build kernel).
      I won't draw conclusions from that.

      Pedro

  13. maybe http://www.brendangregg.com/blog/2017-03-16/perf-sched.html could help further tweaking of MUQSS. HTH.

  14. Running any WINE application hard-locks my system (either on execution or after a period of time). Before, my workaround was to use SCHED_ISO for pulseaudio, jackdbus, and osu!.exe with SCHED_NORMAL for wine and wineserver. However, simply tuning yield_type to 0 fixes this.

    Running CS:GO with yield_type 0 or 1 both showed hard stuttering when loading player or bot threads with multicore rendering enabled. Tuning yield_type to 2 removes this stutter and I am assuming it applies to source games. Thank god you made it tunable.

    After using the system for a considerable while, or playing osu! enough to reach this error, I still come across NOHZ: local_softirq_pending 202. Usually on CFS, the warning goes away with no apparent problems on the system. On MuQSS when the warning appears, the entire system lags. Specifically, the display is not always updated, the mouse stutters and does not poll correctly, and keyboard input is delayed and occasionally does not poll correctly. This issue goes away when I am able to set or run any program with policy SCHED_ISO (and keep it running) or restart with nohz=off at runtime.

    All of this was tested on ck-ivybridge from the repo-ck repository with an Intel i5-3317U. The minimal workarounds are really stable; the only thing worrying me is idle power consumption from disabling idle dynticks. Apart from that, the kernel is awesome to work and play with.

    Replies
    1. And I spoke too soon. The workarounds and tunables above significantly delay the hard lock, but they do not prevent it.

      The only usecase I found to guarantee hardlocking on my system is using yield_type=2 and rr_interval=1, running a wine program using GL/EGL/GLES and opening 1-20 terminals at once.

      I'm pretty confident it's the realtime scheduling issue mentioned before and that all programs that use or bridge to OpenGL on wine coincidentally demand realtime scheduling. htop shows this, but schedtool says otherwise. I'm beginning to think wine is coded to shit.

      In the event that the wine program becomes a zombie, running wineserver -k and schedtool -I on the parent/child process relevant to the zombie kills the process (??). Using schedtool -R on the same process hardlocks the system OR puts the cpu into an unworkable idle state with softirq warnings.

      I've also stopped rtkit-daemon to see if it helps, but to no avail. I really don't want to make a huge list of programs to SCHED_NOT_RR in schedtoold on this incredibly responsive kernel.

    2. Looks like SCHED_BATCH for wine, wineserver, and wine-preloader is the most stable setup for me. Realtime priority programs have no problem running for extended periods of time and wine mostly spawns children at SCHED_NORMAL. Not sure what to make of it.

      Apart from my longstanding issues, latency-wise:

      - primusrun and nvidia-xrun with intel_cpufreq schedutil makes all Valve games I've played open several seconds faster and leaves me with unbelievably low mouse latency on an Optimus system compared to mainline and Windows.

      - I/O detection for my external keyboard and mouse is really fast and never fails to register compared to the few times that happened on mainline.

      - Dota 2 on CFS caps at 30 FPS after reaching a specific load from multiple unit selection (even though it can run well above this on Pause). MuQSS does not have this issue.

      - TTY switching is noticeably faster.

  15. I've benchmarked CFS, MuQSS and CK1 with the Phoronix Test Suite. Last time I did this, MuQSS was still in development.

    http://openbenchmarking.org/result/1704053-RI-410CFSVSM76
    http://openbenchmarking.org/result/1704069-RI-GAMING41095

    I've also updated my google spreadsheet with various yield_type settings and CK1. It's a bit messy though.

    Throughput wise :
    from the PTS results, there is no clear winner. It depends on the workload.
    from the spreadsheet, I would say the best MuQSS setting is "interactive=0" and "interactive=1 & yield_type=0 or 1". CK patchset is slower.

    Pedro

  16. Looks like SCHED_BATCH for wine, wineserver, and wine-preloader is the most stable setup for me. Realtime priority programs have no problem running for extended periods of time and wine mostly spawns children at SCHED_NORMAL. Not sure what to make of it.

    Apart from my longstanding issues, latency-wise:

    - primusrun and nvidia-xrun with intel_cpufreq schedutil makes all Valve games I've played open several seconds faster and leaves me with unbelievably low mouse latency on an Optimus system compared to mainline and Windows.

    - I/O detection for my external keyboard and mouse is really fast and never fails to register compared to the few times that happened on mainline.

    - Dota 2 on CFS caps at 30 FPS after reaching a specific load from multiple unit selection (even though it can run well above this on Pause). MuQSS does not have this issue.

    - TTY switching is noticeably faster.

    Replies
    1. Thanks for the positive feedback too! Good to hear of some concrete examples of advantages.

    2. A suggestion if you're having lockups with -ck is it might be worth building the kernel without threadirqs enabled. It could be a subtle driver priority inversion bug that only shows up with threaded irqs and since they're off by default in mainline they wouldn't be picked up.

    3. Oh, I did not notice this comment, along with an old mailing list concerning wine(server) priority inversion. I did notice less jack2 xruns with this off, but never used it long enough to reach a conclusion. I will test this out.

    4. Without threadirqs, it's pretty stable. I haven't seen any xruns reported from jack2 despite leaving Cadence on for a few hours with moderate workloads compared to w/ threadirqs' occasional pops.

      SCHED_BATCH nice 19 wineserver is still the most stable policy. It also solves freezing issues I had with Ragnarok Online on wine-staging CSMT that I had with CFS. Only lockup I've reached so far is with hibernate from low battery which is a very rare use-case for me.

      It's still a mystery why I haven't found your mailing list on wineserver priority inversion sooner, but at least I reached the same conclusion.
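
For anyone wanting to try the "SCHED_BATCH nice 19" policy from a script rather than with schedtool, here is a minimal sketch using Python's standard `os` scheduling calls on Linux. `make_batch` is a hypothetical helper; pass the PID of the wineserver process (finding that PID is left out). Changing another user's process needs appropriate privileges.

```python
import os


def make_batch(pid=0, nice=19):
    """Switch `pid` (0 = the calling process) to SCHED_BATCH at `nice`.

    Returns the (policy, nice) pair the kernel reports afterwards.
    """
    # SCHED_BATCH uses a static priority of 0, like the default policy.
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    os.setpriority(os.PRIO_PROCESS, pid, nice)
    return os.sched_getscheduler(pid), os.getpriority(os.PRIO_PROCESS, pid)
```

Called with no arguments it applies the policy to the calling process, and child processes started afterwards inherit it.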

  17. Hi Con,

    Linux-ck experiences crashes with Docker. I've posted a description here: https://bbs.archlinux.org/viewtopic.php?pid=1704251 - can you have a look?

    Replies
    1. I haven't experienced any crashes with docker on my system yet with BFQ enabled and I usually leave ziahamza's webui-aria2 docker container running for days.

    2. What are your kernel and docker versions?

    3. Docker version 17.04.0-ce, build 4845c567eb

      4.10.10-1-ck-ivybridge #1 SMP PREEMPT

      I have run docker a few times on ck-generic and linux-ck from AUR with no problems (from 4.9.11+).

      Have you tried using mainline with BFQ to see if it's a bug specific to the I/O scheduler?

  18. Hi Con

    I am playing with ck-1 patch together with Nvidia videocard (Optimus if it matters) under 4.10.10.
    Unfortunately, it freezes the GUI after some time.
    All processes continue to work, only display stops refreshing.
    Is it a known issue?

    BR

    Replies
    1. Sheesh, why is everyone so reserved with posting details ?

      what driver version, what card, what other hardware components ?

      system details !

      Nvidia works fine here (but NOT optimus - so it could be that)

    2. Blog comments is bad place for bug reports, I guess.

      Hardware - laptop with Intel(R) Core(TM) i7-3630QM, GeForce GT 640M LE + Intel Corporation 3rd Gen Core (Optimus). Kernel is 4.10.10 vanilla (Gentoo distribution).
      I tried nvidia-drivers-381.09 and 375.39 with Con patch + xf86-video-intel-2.99.917_p20170313. Both combinations eventually freeze laptop.

    3. @kernelOfTruth: I also experience GUI freezes on 4.10.x series, but not on 4.9.x. I have Asus laptop with Optimus (UX303UB) and those freezes are with 375.39, 378.13 and 381.09 drivers (Gentoo Linux here).

    4. kernelOfTruth, 9 May 2017 at 04:19

      @Денис , @mbar

      >Blog comments is bad place for bug reports, I guess.

      Indeed :/


      by "GUI freezes" you mean that the X server locks up screen content and it doesn't change anymore,

      only a forced reboot (or Magic SYSRQ Key) works ?

      (all comments suggest so)

      I got bash / terminal content freezing and it only gets updated when switching between apps - back and forth; kwin_x11 without compositing

      but that's obviously not the same that you are experiencing,

      did you try using a different window manager or desktop environment if that prevents it from happening ?

      Does disabling frequency switching (ondemand governor, etc.) and switching to "performance" or using intel pstate make a change ?

      This would at least help to narrow it down to specifics and allow you to continue work somehow ...

    5. > by "GUI freezes" you mean that the X server locks up screen content and it doesn't change anymore,
      only a forced reboot (or Magic SYSRQ Key) works ?

      Yes. Exactly.

      > did you try using a different window manager or desktop environment if that prevents it from happening ?
      No, I haven't changed anything. The regression is related to kernel.

      > Does disabling frequency switching (ondemand governor, etc.) and switching to "performance" or using intel pstate make a change ?

      Again, no. I've tried nvidia-drivers-375.20 and 381.09 and that's it. After freezes I went back to 4.8-ck1.

    6. Try BFQ v8r11. There are patches on their website for vanilla 4.10; alternatively, you can update the version with patches from sirlucjan (linux-rt-bfq).

    7. nvidia drivers 375.66 and 381.22 have fixes to prevent deadlock issues with PRIME Sync. It might be worth a try to use those drivers if it's specific to 4.10.

    8. Thanks.
      It's a valuable comment.
      For those interested, see full announcement - https://devtalk.nvidia.com/default/topic/1007268/b/t/post/5141478/#5141478

  19. I'm hitting a BUG when trying to create a QEMU/KVM VM, any ideas? I saw a similar BUG in previous user comments regarding BFS for 4.8, where a person hit the same BUG when trying to use VirtualBox.

    Apr 18 20:08:35 ROG audit[5962]: AVC apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-f02a7ff8-d128-4db2-
    Apr 18 20:08:35 ROG kernel: audit: type=1400 audit(1492564115.688:55): apparmor="STATUS" operation="profile_replace" profile="unconfined"
    Apr 18 20:08:35 ROG kernel: usercopy: kernel memory overwrite attempt detected to ffff9b05d3ece708 (kmalloc-8) (128 bytes)
    Apr 18 20:08:35 ROG kernel: ------------[ cut here ]------------
    Apr 18 20:08:35 ROG kernel: kernel BUG at /usr/src/linux-4.10.0/mm/usercopy.c:75!
    Apr 18 20:08:35 ROG kernel: invalid opcode: 0000 [#1] SMP
    Apr 18 20:08:35 ROG kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 b
    Apr 18 20:08:35 ROG kernel: cryptd snd_hwdep snd_pcm intel_cstate nvidia(POE) intel_rapl_perf snd_seq_midi saa7164 snd_seq_midi_event sn
    Apr 18 20:08:35 ROG kernel: multipath linear uas usb_storage hid_generic usbhid hid raid0 i915 i2c_algo_bit drm_kms_helper syscopyarea s
    Apr 18 20:08:35 ROG kernel: CPU: 0 PID: 4052 Comm: libvirtd Tainted: P OE 4.10.0-19+my-generic #21
    Apr 18 20:08:35 ROG kernel: Hardware name: ASUS All Series/MAXIMUS VII GENE, BIOS 3003 10/28/2015
    Apr 18 20:08:35 ROG kernel: task: ffff9b05aa5c5300 task.stack: ffffaaf342f30000
    Apr 18 20:08:35 ROG kernel: RIP: 0010:__check_object_size+0x77/0x1d6
    Apr 18 20:08:35 ROG kernel: RSP: 0018:ffffaaf342f33ee0 EFLAGS: 00010282
    Apr 18 20:08:35 ROG kernel: RAX: 000000000000005e RBX: ffff9b05d3ece708 RCX: 0000000000000000
    Apr 18 20:08:35 ROG kernel: RDX: 0000000000000000 RSI: ffff9b05efa0dbc8 RDI: ffff9b05efa0dbc8
    Apr 18 20:08:35 ROG kernel: RBP: ffffaaf342f33f00 R08: 0000000000000005 R09: 0000000000000551
    Apr 18 20:08:35 ROG kernel: R10: 0000000000000008 R11: ffffffffa84469cd R12: 0000000000000080
    Apr 18 20:08:35 ROG kernel: R13: 0000000000000000 R14: ffff9b05d3ece788 R15: ffff9b05d3ece708
    Apr 18 20:08:35 ROG kernel: FS: 00007f410c5d1700(0000) GS:ffff9b05efa00000(0000) knlGS:0000000000000000
    Apr 18 20:08:35 ROG kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Apr 18 20:08:35 ROG kernel: CR2: 00007f41196eaaa0 CR3: 00000003f0da5000 CR4: 00000000001406f0
    Apr 18 20:08:35 ROG kernel: Call Trace:
    Apr 18 20:08:35 ROG kernel: SyS_sched_setaffinity+0x6b/0xe0
    Apr 18 20:08:35 ROG kernel: entry_SYSCALL_64_fastpath+0x1e/0xad
    Apr 18 20:08:35 ROG kernel: RIP: 0033:0x7f41188425dc
    Apr 18 20:08:35 ROG kernel: RSP: 002b:00007f410c5d0798 EFLAGS: 00000246 ORIG_RAX: 00000000000000cb
    Apr 18 20:08:35 ROG kernel: RAX: ffffffffffffffda RBX: 00007f41193e271c RCX: 00007f41188425dc
    Apr 18 20:08:35 ROG kernel: RDX: 00007f40e81211e0 RSI: 0000000000000080 RDI: 000000000000174c
    Apr 18 20:08:35 ROG kernel: RBP: 00007f40e83155d0 R08: 00007f40e81de0e0 R09: 0000000000000000
    Apr 18 20:08:35 ROG kernel: R10: 00007f40e81211e0 R11: 0000000000000246 R12: 00007f40e83155d0
    Apr 18 20:08:35 ROG kernel: R13: 00007f41196eaa90 R14: 0000000000000001 R15: 00007f410c5d1698
    Apr 18 20:08:35 ROG kernel: Code: c7 c2 13 4f ed a7 48 c7 c6 d1 da e9 a7 48 c7 c7 60 a5 e9 a7 48 0f 44 d1 48 c7 c1 8a 2e e9 a7 48 0f 44 f
    Apr 18 20:08:35 ROG kernel: RIP: __check_object_size+0x77/0x1d6 RSP: ffffaaf342f33ee0
    Apr 18 20:08:35 ROG kernel: ---[ end trace 7f5e3e96a69c8802 ]---

    Replies
    1. I have hit this kind of issue too. It prevents some of my programs, such as winecfg, from running.

      Hope it will be fixed soon.

    2. Hardened usercopy related (CONFIG_HARDENED_USERCOPY_PAGESPAN)

      https://patchwork.kernel.org/patch/9181869/

      https://lkml.org/lkml/2017/1/15/152

      could be the scheduler (MuQSS) or something else entirely ...

    3. I think that MuQSS is incompatible with CONFIG_CPUMASK_OFFSTACK, which is implied by CONFIG_MAXSMP ("Configure maximum number of SMP processors and NUMA Nodes"). sched/core.c get_user_cpu_mask bounds the copy length to cpumask_size(), which is a runtime value when CONFIG_CPUMASK_OFFSTACK is set, but MuQSS's version bounds it to sizeof(cpumask_t) which will, in this case, probably be larger than the actual target buffer. Refer to Linux commit 96f874e26428a (from 2008). I think MuQSS needs to either handle this case, or require !CONFIG_CPUMASK_OFFSTACK.
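
The arithmetic behind this diagnosis can be modelled in a few lines. This is a simulation of the length clamp only, not kernel code: NR_CPUS = 8192 for a MAXSMP build and 8 possible CPUs at runtime are assumptions chosen to match the "(kmalloc-8) (128 bytes)" line in the BUG report above, and all function names are hypothetical.

```python
BITS_PER_LONG = 64  # x86_64


def mask_bytes(nr_bits):
    """Size in bytes of a CPU bitmap of nr_bits, rounded up to whole longs."""
    longs = (nr_bits + BITS_PER_LONG - 1) // BITS_PER_LONG
    return longs * (BITS_PER_LONG // 8)


def copy_len(user_len, bound):
    """Bytes copied once the user-supplied length is clamped to `bound`."""
    return min(user_len, bound)


def overflows(user_len, bound, buffer_size):
    """True if the clamped copy writes past a buffer of `buffer_size` bytes."""
    return copy_len(user_len, bound) > buffer_size


NR_CPUS = 8192   # assumed build-time limit with CONFIG_MAXSMP
NR_CPU_IDS = 8   # assumed number of possible CPUs at runtime

static_size = mask_bytes(NR_CPUS)      # models sizeof(cpumask_t): 1024 bytes
runtime_size = mask_bytes(NR_CPU_IDS)  # models cpumask_size(): 8 bytes
user_len = 128                         # length reported in the BUG above

# Clamping to the runtime size keeps the copy inside the 8-byte buffer.
print(overflows(user_len, runtime_size, runtime_size))  # False
# Clamping to the static size lets all 128 bytes through.
print(overflows(user_len, static_size, runtime_size))   # True
```

Under these assumptions the second case reproduces the numbers in the report: a 128-byte copy aimed at an 8-byte allocation, which is what hardened usercopy rejects.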

    4. That's very helpful information. Thank you very much.

  20. Hey,

    I've had this issue (more like an annoyance) since I've been using linux 4.10 with muqss.

    I am usually running 4 vms (Windows and Linux) and every vm produces this kind of kernel warning:

    [ +0.000029] WARNING: CPU: 2 PID: 2655 at arch/x86/kvm/lapic.c:1468 kvm_lapic_expired_hv_timer+0xee/0x110 [kvm]
    [ +0.000002] Modules linked in: vhost_net vhost macvtap macvlan fuse ctr ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c crc32c_generic nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic arc4 nls_iso8859_1 nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm kvm_intel nouveau kvm mac80211 iTCO_wdt iTCO_vendor_support eeepc_wmi asus_wmi irqbypass sparse_keymap snd_hda_intel crct10dif_pclmul crc32_pclmul evdev crc32c_intel snd_hda_codec joydev input_leds mousedev ghash_clmulni_intel led_class snd_hwdep mac_hid aesni_intel
    [ +0.000062] snd_hda_core iwlwifi mxm_wmi aes_x86_64 crypto_simd snd_pcm ttm cryptd glue_helper snd_timer e1000e i2c_algo_bit snd cfg80211 intel_cstate psmouse intel_rapl_perf soundcore hci_uart ptp btbcm pcspkr i2c_i801 pps_core btqca btintel bluetooth mei_me mei battery shpchp rfkill acpi_als video tpm_tis kfifo_buf intel_lpss_acpi wmi tpm_tis_core i2c_hid tpm industrialio intel_lpss fjes acpi_pad button sch_fq_codel sg ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid sd_mod serio_raw atkbd libps2 ahci libahci libata xhci_pci xhci_hcd scsi_mod usbcore usb_common i8042 serio nvidia_drm(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm nvidia_uvm(PO) nvidia_modeset(PO) nvidia(PO)
    [ +0.000079] CPU: 2 PID: 2655 Comm: CPU 0/KVM Tainted: P W O 4.10.10-1-ck #1
    [ +0.000002] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 3401 01/25/2017
    [ +0.000001] Call Trace:
    [ +0.000008] dump_stack+0x76/0xa0
    [ +0.000005] __warn+0xda/0x100
    [ +0.000005] warn_slowpath_null+0x30/0x40
    [ +0.000019] kvm_lapic_expired_hv_timer+0xee/0x110 [kvm]
    [ +0.000006] handle_preemption_timer+0x21/0x30 [kvm_intel]
    [ +0.000006] vmx_handle_exit+0x169/0x1480 [kvm_intel]
    [ +0.000005] ? clear_atomic_switch_msr+0x15a/0x180 [kvm_intel]
    [ +0.000005] ? atomic_switch_perf_msrs+0x7e/0xb0 [kvm_intel]
    [ +0.000022] kvm_arch_vcpu_ioctl_run+0x880/0x1690 [kvm]
    [ +0.000005] ? _copy_to_user+0x67/0x80
    [ +0.000013] kvm_vcpu_ioctl+0x348/0x640 [kvm]
    [ +0.000004] do_vfs_ioctl+0xb2/0x600
    [ +0.000005] ? __fget+0x8a/0xc0
    [ +0.000002] SyS_ioctl+0x88/0xa0
    [ +0.000006] entry_SYSCALL_64_fastpath+0x1a/0xa9
    [ +0.000002] RIP: 0033:0x7f45c136e0d7
    [ +0.000002] RSP: 002b:00007f45b2efb8a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    [ +0.000004] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f45c136e0d7
    [ +0.000002] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000012
    [ +0.000001] RBP: 00007f45b374f980 R08: 0000563b84228b90 R09: 00000000000000ff
    [ +0.000002] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
    [ +0.000002] R13: 00007f45c85d9000 R14: 0000000000000000 R15: 00007f45b374f980
    [ +0.000003] ---[ end trace f36b1d45e660e691 ]---

    I've also reported this to the Linux kernel Bugzilla, but so far there has been no response.
    Strangely, I am not getting these errors with a vanilla kernel, so I assume that this has something to do with the MuQSS scheduler.

    I remember you made some changes to the high-resolution timers and changed some things all over the Linux kernel. Could this also have affected this?

    Peet

  21. information from kdb

    -----summary-----
    sysname Linux
    release 4.10.11-ck1-otakux-2
    version #6 SMP PREEMPT Wed Apr 19 20:38:49 CST 2017
    machine x86_64
    nodename otakux-VirtualBox
    domainname (none)
    ccversion CCVERSION
    date 2017-04-20 09:15:33 tz_minuteswest 0
    uptime 00:04
    load avg 1.14 0.78 0.32

    MemTotal: 2045916 kB
    MemFree: 968412 kB
    Buffers: 28260 kB

    -----panic-----
    usercopy: kernel memory overwrite attempt detected to ffff8873f7093bb0 (kmalloc-8) (128 bytes)

    Entering kdb (current=0xffff8873f87f0000, pid 1684) on processor 1 Oops: (null)
    due to oops @ 0xffffffff9222fafe
    CPU: 1 PID: 1684 Comm: wineserver Tainted: G W 4.10.11-ck1-otakux-2 #6
    Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    task: ffff8873f87f0000 task.stack: ffffa712c149c000
    RIP: 0010:__check_object_size+0x6e/0x1e3
    RSP: 0018:ffffa712c149fee0 EFLAGS: 00010282
    RAX: 000000000000005e RBX: ffff8873f7093bb0 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff8873ffd0dbc8 RDI: ffff8873ffd0dbc8
    RBP: ffffa712c149ff00 R08: 0000000000000001 R09: 00000000000001fa
    R10: 0000000000000008 R11: ffffffff9323f98d R12: 0000000000000080
    R13: 0000000000000000 R14: ffff8873f7093c30 R15: ffff8873f7093bb0
    FS: 00007f424f661700(0000) GS:ffff8873ffd00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fe923f0e640 CR3: 0000000077ad9000 CR4: 00000000000406e0
    Call Trace:
    SyS_sched_setaffinity+0x6b/0x100
    entry_SYSCALL_64_fastpath+0x1e/0xad
    RIP: 0033:0x7f424edd5d7c
    RSP: 002b:00007fffb2ce8d78 EFLAGS: 00000206 ORIG_RAX: 00000000000000cb
    RAX: ffffffffffffffda RBX: 0000000000001b4a RCX: 00007f424edd5d7c
    RDX: 00007fffb2ce8d80 RSI: 0000000000000080 RDI: 0000000000000696
    RBP: 00000000006c7980 R08: 0000000000000696 R09: 00007fffb2ce8f30
    R10: 00000000000002f9 R11: 0000000000000206 R12: 00000000006c94e0
    R13: 00007fffb2ce8e70 R14: 00000000006c4cd0 R15: 0000000000000148
    Code: 48 0f 44 d1 48 c7 c6 26 f4 c9 92 48 c7 c1 07 9d ca 92 48 0f 45 f1 4d 89 e1 49 89 c0 48 89 d9 48 c7 c7 90 63 ca 92 e8 e8 72 f6 ff <0f> 0b 48
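
    The user-space register lines in a dump like this can be read back as the system call that was in flight. Below is a minimal sketch, assuming the x86_64 syscall convention (ORIG_RAX holds the syscall number; RDI, RSI and RDX hold the first three arguments). The helper name `decode_user_regs` and the two-entry table are illustrative only and cover just the numbers that appear in the traces on this page.

```python
import re

# Partial x86_64 syscall table: only the numbers seen in the traces above.
SYSCALLS_X86_64 = {16: "ioctl", 203: "sched_setaffinity"}


def decode_user_regs(dump):
    """Return the syscall name and its first three arguments from a register dump."""
    regs = {
        name: int(value, 16)
        for name, value in re.findall(
            r"\b(ORIG_RAX|RDI|RSI|RDX): ([0-9a-f]{16})", dump
        )
    }
    nr = regs["ORIG_RAX"]
    return {
        "syscall": SYSCALLS_X86_64.get(nr, "unknown(%d)" % nr),
        "args": (regs["RDI"], regs["RSI"], regs["RDX"]),
    }


# User-space register lines copied from the dump above.
DUMP = """
RSP: 002b:00007fffb2ce8d78 EFLAGS: 00000206 ORIG_RAX: 00000000000000cb
RAX: ffffffffffffffda RBX: 0000000000001b4a RCX: 00007f424edd5d7c
RDX: 00007fffb2ce8d80 RSI: 0000000000000080 RDI: 0000000000000696
"""

if __name__ == "__main__":
    print(decode_user_regs(DUMP))
```

    Read this way, the second argument (RSI = 0x80) is a 128-byte CPU mask size, which matches the "(128 bytes)" in the usercopy message.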

  22. It seems that the MuQSS scheduler is causing animation lag with GNOME 3.24.1. Can someone reproduce this? (Try moving windows or hovering over the left navigation bar in Nautilus.)

    Replies
    1. Have you already tried comparing your setup with Alfred Chen's VRQ patch applied instead of MuQSS? I don't want to advertise it, but it may be worth a try. On my system Alfred's patch gives much better responsiveness overall, without negative effects. I haven't tested gaming on my machine.
      http://cchalpha.blogspot.de/2017/04/vrq-095-release.html

      BR, Manuel Krause

    2. I might give this a try. I am wondering, though, whether you have tested any work-intensive tasks such as compiling large projects like llvm/clang or chromium while running multiple virtual machines?

    3. No, I don't run these kinds of workloads here, as I have no need for them. However, kernel compilation, heavy swapping and additional I/O are common here. BTW, I also use the most recent BFQ I/O scheduler.

      Let us know after your VRQ trial.
      BR, Manuel Krause

  23. @Con:
    The BFQ I/O scheduler has recently been updated to a stable release, v8r10. Maybe it's time to pick it up in your -ck patchset.

    BR, Manuel Krause

  24. Every boot using reiserfs gives the following WARN:


    [ 7.106973] ------------[ cut here ]------------
    [ 7.107654] WARNING: CPU: 0 PID: 30 at fs/quota/dquot.c:619 dquot_writeback_dquots+0x248/0x250
    [ 7.108356] Modules linked in: nls_iso8859_1 nls_cp437 snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support acer_wmi sparse_keymap coretemp hwmon joydev intel_rapl x86_pkg_temp_thermal intel_powerclamp pcspkr snd_hda_codec_realtek psmouse snd_hda_codec_generic efi_pstore i915 ath9k ath9k_common ath9k_hw input_leds ath snd_hda_intel efivars mac80211 drm_kms_helper snd_hda_codec cfg80211 snd_hda_core atl1c led_class snd_hwdep nvidiafb snd_pcm drm vgastate fb_ddc i2c_i801 lpc_ich intel_gtt snd_timer syscopyarea sysfillrect sysimgblt mei_me fb_sys_fops mei i2c_algo_bit shpchp acpi_cpufreq tpm_tis tpm_tis_core tpm thermal wmi video button evdev mac_hid sch_fq_codel uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core vboxnetflt(O) videodev vboxnetadp(O) pci_stub media vboxpci(O) vboxdrv(O)
    [ 7.112231] ath3k btusb btrtl btbcm btintel bluetooth rfkill loop usbip_host usbip_core sg ip_tables x_tables hid_generic usbhid hid sr_mod cdrom sd_mod serio_raw atkbd libps2 ehci_pci xhci_pci xhci_hcd ehci_hcd ahci libahci libata scsi_mod usbcore usb_common i8042 serio raid1 raid0 dm_mod md_mod
    [ 7.114406] CPU: 0 PID: 30 Comm: kworker/0:1 Tainted: G O 4.10.8-1-ck1-ck #1
    [ 7.115804] Hardware name: Acer Aspire V3-771/VA70_HC, BIOS V2.16 01/14/2013
    [ 7.117213] Workqueue: events_long flush_old_commits
    [ 7.118632] Call Trace:
    [ 7.120042] ? dump_stack+0x5c/0x7a
    [ 7.122146] ? __warn+0xb4/0xd0
    [ 7.124214] ? dquot_writeback_dquots+0x248/0x250
    [ 7.126422] ? reiserfs_sync_fs+0x12/0x70
    [ 7.127951] ? finish_task_switch+0x7f/0x390
    [ 7.129203] ? flush_old_commits+0x30/0x50
    [ 7.130473] ? process_one_work+0x1b1/0x3a0
    [ 7.131714] ? worker_thread+0x42/0x4c0
    [ 7.132952] ? kthread+0xea/0x120
    [ 7.134202] ? process_one_work+0x3a0/0x3a0
    [ 7.135432] ? kthread_create_on_node+0x40/0x40
    [ 7.136663] ? ret_from_fork+0x26/0x40
    [ 7.137996] ---[ end trace 8c87d43bebda3f80 ]---

  25. Could you give a hint on how to do that?
    It looks like threadirq is built unconditionally in 4.10 and I don't have threadirq as boot parameter.

    Replies
    1. It's actually a unique config option in -ck only:
      Symbol: FORCE_IRQ_THREADING [=y]
      Type : boolean
      Prompt: Make IRQ threading compulsory
      Location:
      -> General setup
      -> IRQ subsystem
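
      Whether a given kernel was built with this option can be checked from its config file. Below is a minimal sketch, assuming the usual `.config` text format, in which the Kconfig symbol appears as `CONFIG_FORCE_IRQ_THREADING`. The function name `config_value` and the `SAMPLE` text are made up for illustration.

```python
def config_value(config_text, symbol):
    """Return the value of CONFIG_<symbol> from kernel .config text, or None if unset."""
    key = "CONFIG_" + symbol
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options are recorded as a comment line.
        if line == "# %s is not set" % key:
            return None
        if line.startswith(key + "="):
            return line.split("=", 1)[1]
    return None


# Hypothetical excerpt of a kernel config, for illustration only.
SAMPLE = """\
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_FORCE_IRQ_THREADING=y
# CONFIG_SPARSE_IRQ is not set
"""

if __name__ == "__main__":
    print(config_value(SAMPLE, "FORCE_IRQ_THREADING"))
```

      On a running system the text would typically come from /boot/config-<release>, or from /proc/config.gz if the kernel was built with CONFIG_IKCONFIG_PROC.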

    2. Thanks. I've tested with this option turned off.
      Unfortunately, it freezes anyway. 4.8-ck1 is rock-solid though.

  26. I think my issues with wine (which I have narrowed down to mostly wineserver) might be a priority inversion issue. Applications zombify when audio is out of sync and I'm assuming they deadlock when it doesn't render something in time.

    osu! with a SCHED_BATCH wineserver will lock up the system when running a SCHED_IDLEPRIO make -j8, given some time. A SCHED_BATCH nice 19 wineserver delays the lockup much longer under the same stress, but will still occasionally zombify it. I have also tried this with Zero Escape: The Nonary Games and got similar results. Apart from this test case, wineserver is relatively stable with these policies under moderate stress.
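
    For anyone wanting to reproduce this setup, the policies mentioned above can be applied to a running process roughly as follows. This is a minimal sketch using Python's standard `os` module on Linux. It assumes that -ck kernels expose SCHED_IDLEPRIO under the same policy number as mainline SCHED_IDLE, which should be verified against the patched kernel headers. The helper names `set_policy` and `describe` are illustrative only.

```python
import os

# Assumption: SCHED_IDLEPRIO in -ck uses the same policy number as
# mainline SCHED_IDLE; check the patched kernel headers to confirm.
SCHED_IDLEPRIO = os.SCHED_IDLE


def set_policy(pid, policy, nice=None):
    """Apply a non-realtime policy (static priority 0) and optionally a nice level."""
    os.sched_setscheduler(pid, policy, os.sched_param(0))
    if nice is not None:
        os.setpriority(os.PRIO_PROCESS, pid, nice)


def describe(pid):
    """Return (policy number, nice level) for a process; pid 0 means the caller."""
    return os.sched_getscheduler(pid), os.getpriority(os.PRIO_PROCESS, pid)


# Example (not run here; wineserver_pid is a placeholder for the real PID):
#   set_policy(wineserver_pid, os.SCHED_BATCH, nice=19)
```

    The same effect can usually be had from a shell with chrt and renice. The sketch is only meant to make explicit which two settings, policy and nice level, are being combined.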

    What I discovered along the way was that when compiling DKMS modules, CFS would sometimes terminate it with SIGPIPE during context switch. This occurs more frequently with linux-zen and linux-rt-bfq when BFQ is enabled. I have not seen this happen once on the ck-patchset and my test kernels with MuQSS in the past 3 months.

  28. Is the 4.11 resync in the works?

  29. Con, could you try pushing MuQSS to mainline again (https://lkml.org/)? Maybe Linus and the scheduler maintainers have changed their past views and might reconsider the pull.

    Replies
    1. I don't have the time, inclination, intestinal fortitude nor psychological disturbance required for attempting something so futile. Linus' position against multiple CPU schedulers in the kernel has been hard line for over a decade. Additionally a patch this size maintained in mainline requires a full time job to respond to issues and maintain. I spend a few days every few months on this patch and it's fun; why would I want to make it torture?

  30. Using MuQSS I get lag when playing CPU-intensive Wine games with CSMT (command stream), such as WoW.
    This lag is not present with CFS on the stock Arch kernel.
    Using renice helps, however.
    I'm using yield type 0.
    The lag occurs especially when the scene changes.
