Script the disabling of fast_clz on ffmpeg windows |
||||||||||||
Issue descriptionSee https://chromium-review.googlesource.com/#/c/344671/ which did this manually, but that will be lost on the next ffmpeg roll. This bug tracks implementing an automated solution for use during ffmpeg rolling. Perhaps one of: 1) run ffmpeg ./configure for windows in a VM with the oldest CPU we support (credit: thakis@chromium.org) 2) figure out a --cpu to pass to ffmpeg ./configure for windows which won't munge a bunch of other bits (credit: dalecurtis@chromium.org) 3) add a new munge script specifically for ia32 builds: https://code.google.com/p/chromium/codesearch#chromium/src/third_party/ffmpeg/chromium/scripts/munge_config_optimizations.sh These scripts (we don't have any right now) are normally called in build_ffmpeg.py when generating the configuration options. The latter while slightly messier seems like it might be safer. (credit: dalecurtis@chromium.org) This bug doesn't absolutely block the next FFmpeg roll, but ideally it would be fixed as part of it. Otherwise, each roll will need manual intervention to disable fast_clz on windows.
,
May 16 2016
Note that https://chromium-review.googlesource.com/#/c/344671/ may actually fix this bug (following route #2 or #3). It's iterating in CR right now.
,
May 16 2016
Since configure probes the local system, it probably applies to non-Windows too -- but the intrinsic setup on non-Windows until very recently was so that you could only use e.g. SSE intrinsics if you compiled your code with a --target cpu that supports SSE. On Windows, it's historically possible to use e.g. SSE intrinsics in files where the compiler isn't allowed to use SSE for the code it creates, and so it's easier to (accidentally and intentionally) mix-and-match SSE and non-SSE code in one TU. So it's probably more pressing on Windows. https://chromium-review.googlesource.com/#/c/344671/ does (2) for now (but we can probably do better later).
,
May 16 2016
Ok. I'll keep this bug open for possibly implementing (3) (for ia32 and x64 windows/linux/mac/etc) in the future. Once https://chromium-review.googlesource.com/#/c/344671 lands with implementation of (2), this bug will no longer block bug 591845 .
,
May 16 2016
Sounds good. Please let me know if the --cpu flag causes any issues. Happy to help in that case.
,
May 18 2016
@#4: It landed and rolled (https://chromium.googlesource.com/chromium/src/+/476f6c9166db498c42887fb78229b6de54f6e148), so this bug no longer blocks bug 591845 . Keeping open for possibly implementing (3) in future.
,
Dec 13 2016
=> hubbe@ for potentially implementing (3) in the M-57 roll.
,
Dec 13 2016
=> liberato's doing the M-57 roll instead.
,
Oct 20 2017
,
Oct 23 2017
xhwang@ - Is this something you could fix during your M65 roll? (If not, please remark as blocking the next ffmpeg roll beyond yours).
,
Oct 23 2017
,
Oct 23 2017
Assign to sandersd@ since the ffmpeg rotation has shifted and now sandersd@ will work on the M65 roll.
,
Oct 23 2017
,
Oct 23 2017
,
Oct 23 2017
ffmpeg uses runtime cpu detection for most things, so we're mostly just worried about this one edge case with the MSVC specific intrinsic. For this case does anyone remember why we didn't just check defined(__BMI__) ?
,
Oct 24 2017
Seems no __BMI__ in MSVC, and even with clang -mavx2 doesn't seem to imply -mbmi; so that's a bit surprising. In any case, looking over the discussion on cfe-dev@ hans says that BMI isn't required for _tzcnt_*: http://lists.llvm.org/pipermail/cfe-dev/2016-October/051331.html ...but that still isn't fixed on trunk it seems: ../../third_party/ffmpeg/libavcodec/h2645_parse.c(202,17): error: implicit declaration of function '_tzcnt_u32' is invalid in C99 [-Werror,-Wimplicit-function-declaration] size -= ff_ctz(v) + 1; I think a simple patch to x86/intmath.h to only define these when !__clang__ would be upstreamable. Will send it and see.
,
Oct 24 2017
I don't remember details, but I think on our end we concluded that what ffmpeg is doing is really questionable and we shouldn't change the compiler. So if you're changing ffmpeg upstream, maybe you could make things just not behave weirdly. Hans, do you remember what that meant concretely?
,
Oct 24 2017
Happy to try if it's truly questionable, but we don't see any crashes in this code prior to your fix (and the number of opterons is non-zero), so I think hans@ is right: https://crash.corp.google.com/browse?q=product.name%20CONTAINS%20%27Chrome%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.file_path%20CONTAINS%20%27ffmpeg%2Flibavutil%2Fmathematics.c%27%20AND%20product.Version%20CONTAINS%20%2751.0.%27&sql_dialect=dremelsql&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D Either we don't have any Opteron users or it would seem either _tzcnt_* maps correctly on non-BMI hardware with MSVC.
,
Oct 24 2017
For historical context, hans@ originally landed http://llvm.org/viewvc/llvm-project?view=revision&revision=253358 based on this discussion on ffmpeg-devel: https://ffmpeg.org/pipermail/ffmpeg-devel/2015-November/183404.html -- but I guess it got reverted at some point. There was a later attempt by thakis@ as part of https://bugs.llvm.org/show_bug.cgi?id=30506, but it appears this never landed. Per the e-mail it seems pretty clear ffmpeg is okay with the "rep bsf" encoding tzcnt degrades to (and answers my question about why we didn't have crashes). Probably the best I can do from the ffmpeg side is to add a __clang__ #define if clang-cl isn't planning to match MSVC behavior here.
,
Oct 24 2017
Sent http://ffmpeg.org/pipermail/ffmpeg-devel/2017-October/218445.html upstream which I think is as much as I can get away with.
,
Oct 24 2017
Per the discussion on https://reviews.llvm.org/D26335#inline-227758 , iirc the argument went the rep bsf doesn't check for 0, so if you really want tzcnt you need to check for 0 yourself first, and at that point it really doesn't buy you anything to use the fallback encoding. So it's not clear that they get any perf wins from what they're currently doing (and I think the commit message that added that code was like "might as well..." and not anything rigorous)
,
Oct 25 2017
Upstream took my patch http://git.videolan.org/?p=ffmpeg.git;a=commit;h=50e30d9bb71e1dff27be16c264fac90e362b9896 so we can drop the cpu switch with the next ffmpeg roll.
,
Nov 17 2017
Fixed in latest roll. No more cpu=opteron and all builds are done with Clang, so we shouldn't get any bad flags again. |
||||||||||||
►
Sign in to add a comment |
||||||||||||
Comment 1 by wolenetz@chromium.org
, May 16 2016