@Liangxijun-1001
Contributor

My env: AMD RX 7600
I debugged the MIOpen source code and found the following. During the model compilation phase, the Python side calls the conv2d.setup interface, which invokes MIOpen's findConvForwardAlgorithm interface to find a suitable algorithm. The resulting <layer-Problem, Algorithm> pair is then registered in the invokers of the current process. As long as that process keeps running, the inference stage can look up the Algorithm for each layer-problem in the invokers and run inference directly.

However, if the current process is exited and the pre-compiled model is executed without the prior invocation of findConvForwardAlgorithm during the compilation phase, the corresponding <layer-Problem, Algorithm> pair won't be registered within the invokers. As a result, the inference stage will report an error stating "MIOpen Error: No invoker was registered for convolution forward."
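
The behaviour described above can be modelled with a small toy registry. This is only a sketch in plain Python, not MIOpen code: `ToyHandle`, `find_conv_forward`, and `conv_forward` are hypothetical names, and the problem key and algorithm name are made up for illustration. It shows why a fresh process that skips the find step has nothing to look up.

```python
class ToyHandle:
    """Toy stand-in for a per-process MIOpen handle (hypothetical, not the real API)."""

    def __init__(self):
        # Maps a layer problem description to the algorithm chosen for it.
        # It lives only in memory, so it starts empty in every new process.
        self.invokers = {}

    def find_conv_forward(self, problem):
        # Stand-in for the find step: pick an algorithm and register the pair.
        algo = "toy_algo_for_" + problem
        self.invokers[problem] = algo
        return algo

    def conv_forward(self, problem):
        # Stand-in for the convolution call: it only consults the registry.
        if problem not in self.invokers:
            raise RuntimeError("No invoker was registered for convolution forward.")
        return self.invokers[problem]


# Compile and run in the same process: the find step registered the pair.
compile_handle = ToyHandle()
compile_handle.find_conv_forward("conv_3x224x224_k7")
same_process_algo = compile_handle.conv_forward("conv_3x224x224_k7")

# Run a pre-compiled model in a new process: the registry is empty.
fresh_handle = ToyHandle()
try:
    fresh_handle.conv_forward("conv_3x224x224_k7")
    fresh_error = None
except RuntimeError as err:
    fresh_error = str(err)
```

In this model the only way to make the second case work is to run the find step again in the new process before the convolution call.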

According to the section of the official MIOpen documentation that describes how MIOpen differs from cuDNN, the typical sequence for calling Convolution APIs in MIOpen is as follows:

miopenConvolution*GetWorkSpaceSize(): This function returns the workspace size required by the Find() operation.

miopenFindConvolution*Algorithm(): This function returns performance information about various algorithms.

miopenConvolution*(): Actual convolution operation.

The official documentation emphasizes that calling miopenFindConvolution*Algorithm() is mandatory before using any Convolution API.

Additionally, according to the documentation found at https://rocm.docs.amd.com/projects/MIOpen/en/latest/convolution.html#miopenfindconvolutionforwardalgorithm, the last parameter of the miopenFindConvolutionForwardAlgorithm interface, exhaustiveSearch, should be set to 1 (true):

If exhaustiveSearch == 0, MIOpen will seek the first kernel with a configuration match. If no configuration match is found, a default configuration will be returned.

If exhaustiveSearch == 1, MIOpen will search for the best kernel for the provided configuration. If a match is not found, an exhaustive search is performed by running individual algorithms.
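
The two modes quoted above can be pictured with a toy selection function. This is a sketch of the documented behaviour only, not MIOpen's implementation: `toy_find`, the `find_db` dictionary, the `benchmark` callback, and all algorithm names and timings are hypothetical.

```python
def toy_find(problem, find_db, benchmark, exhaustive_search, default="default_config"):
    """Toy model of the two documented exhaustiveSearch modes (hypothetical helper).

    find_db maps a problem to a list of (algorithm, time_ms) records.
    benchmark(problem) returns a dict of algorithm -> measured time_ms.
    """
    candidates = find_db.get(problem, [])
    if not exhaustive_search:
        # exhaustiveSearch == 0: take the first matching record, else a default.
        return candidates[0][0] if candidates else default
    if candidates:
        # exhaustiveSearch == 1 with a match: take the fastest recorded entry.
        return min(candidates, key=lambda record: record[1])[0]
    # exhaustiveSearch == 1 without a match: time each algorithm, keep the best.
    timings = benchmark(problem)
    return min(timings, key=timings.get)


def fake_benchmark(problem):
    # Made-up timings standing in for actually running each algorithm.
    return {"gemm": 1.2, "direct": 0.7}


db = {"known_problem": [("direct", 0.9), ("winograd", 0.4)]}

first_match = toy_find("known_problem", db, fake_benchmark, 0)
best_match = toy_find("known_problem", db, fake_benchmark, 1)
fallback_default = toy_find("new_problem", db, fake_benchmark, 0)
searched = toy_find("new_problem", db, fake_benchmark, 1)
```

Under these made-up numbers the non-exhaustive mode settles for the first record, while the exhaustive mode either picks the fastest record or measures the candidates itself.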

For further details refer to this link: https://rocmdocs.amd.com/projects/MIOpen/en/latest/MIOpen_Porting_Guide.html

cc: @tqchen @masahi @Lunderberg

@Liangxijun-1001 force-pushed the main_path branch 4 times, most recently from 7ae6725 to 3a900bb (December 2, 2023 03:16)
@Liangxijun-1001
Contributor Author

Liangxijun-1001 commented Dec 2, 2023

cc: @jinhongyii @junrushao

@tqchen
Member

tqchen commented Dec 2, 2023

Thanks @Liangxijun-1001. I think in this case, maybe we should ignore the fwd_algo in the entry (decided at compile time) and instead use the algorithm returned by miopenFindConvolutionForwardAlgorithm

entry_ptr->handle, entry_ptr->conv_entry.input_desc, x->data,
entry_ptr->conv_entry.filter_desc, w->data, entry_ptr->conv_entry.conv_desc,
entry_ptr->conv_entry.output_desc, y->data, request_algo_count, &returned_algo_count,
perfs, workspace, workspace_size, exhaustive_search));
Member

consider reset entry_ptr->fwd_algo to simply the best returned algo
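
The suggestion can be sketched in a few lines of plain Python. This is a toy model, not the TVM or MIOpen code: `ToyPerf`, `ToyEntry`, and `reset_to_best` are hypothetical, the algorithm ids and timings are invented, and the sketch picks the minimum time explicitly rather than assuming any ordering of the returned results.

```python
from dataclasses import dataclass


@dataclass
class ToyPerf:
    """Toy stand-in for one performance result returned by the find call."""
    fwd_algo: int
    time_ms: float


@dataclass
class ToyEntry:
    """Toy stand-in for the cached entry holding the compile-time choice."""
    fwd_algo: int


def reset_to_best(entry, perfs):
    # Overwrite the compile-time algorithm with the fastest one found at run time.
    if not perfs:
        raise ValueError("find returned no algorithms")
    entry.fwd_algo = min(perfs, key=lambda perf: perf.time_ms).fwd_algo
    return entry.fwd_algo


entry = ToyEntry(fwd_algo=5)  # value chosen at compile time (made up)
perfs = [ToyPerf(fwd_algo=1, time_ms=0.8), ToyPerf(fwd_algo=3, time_ms=0.3)]
chosen = reset_to_best(entry, perfs)
```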

@Liangxijun-1001
Contributor Author

Liangxijun-1001 commented Dec 4, 2023

> Thanks @Liangxijun-1001. I think in this case, maybe we should ignore the fwd_algo in the entry (decided at compile time) and instead use the algorithm returned by miopenFindConvolutionForwardAlgorithm

My MIOpen tag: v2.18
I am now following the official invocation steps. In the inference stage, the 'find' function has to be called for each convolution operation (https://rocm.docs.amd.com/projects/MIOpen/en/latest/find_and_immediate.html and https://github.com/ROCmSoftwarePlatform/MIOpen/blob/develop/docs/MIOpen_Porting_Guide.md). This 'find' function invokes the Find-DB operation to search the database. Below is the log captured after setting the relevant logging environment variables.
export MIOPEN_ENABLE_LOGGING=0
export MIOPEN_LOG_LEVEL=7
Related logs:

input_shape: (1, 3, 224, 224)
precessing image: ../../dataset/images/test/kitten.jpg
origina image shape: (720, 720, 3)
resize image shape: (1, 3, 224, 224)
MIOpen(HIP): Info [get_device_name] Raw device name: gfx1102
MIOpen(HIP): Info [Handle] stream: 0, device_id: 0
MIOpen(HIP): Info [get_device_name] Raw device name: gfx1102
MIOpen(HIP): Info [SetStream] stream: 0, device_id: 0
MIOpen(HIP): Info [GetFindModeValueImpl] MIOPEN_FIND_MODE = DYNAMIC_HYBRID(5)
MIOpen(HIP): Info [AmdRocmMetadataVersionDetect] ROCm MD version AMDHSA_COv3, HIP version 5.7.23365, MIOpen version 2.20.0.f185a6464-dirty
MIOpen(HIP): Info2 [GetWorkSpaceSize]
MIOpen(HIP): Info [GetSolutions]
MIOpen(HIP): Info [IsNetworkedFilesystem] Filesystem type at '/home/liangnus/.config/miopen/' is: 0xef53 'EXT2/3/4_SUPER_MAGIC'
MIOpen(HIP): Info2 [GetLibPath] Lib Path: /opt/rocm-5.7.1/lib/libMIOpen.so.1.0.50701
MIOpen(HIP): Info2 [GetInstalledPathFile] inexact find database search
MIOpen(HIP): Info2 [GetInstalledPathFile] Iterating over find db directory /opt/rocm-5.7.1/share/miopen/db/
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx900_64.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx803_36.OpenCL.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx90a68.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx90a6e.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx803_36.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx906_64.OpenCL.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx906_60.OpenCL.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx906_60.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx803_64.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx1030_36.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx900_64.OpenCL.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx900_56.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx906_64.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx90878.HIP.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx900_56.OpenCL.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx803_64.OpenCL.fdb
MIOpen(HIP): Info [GetInstalledPathFile] Checking find db file: gfx90878.OpenCL.fdb
MIOpen(HIP): Info [Measure] ReadonlyRamDb::Prefetch time: 5e-05 ms
MIOpen(HIP): Info [Measure] RamDb::Prefetch time: 0.206036 ms
MIOpen(HIP): Info2 [ValidateUnsafe] DB file is newer than cache: 3876535210630, 3116245083235
MIOpen(HIP): Info2 [FindRecord] RamDb file is newer than cache, prefetching
MIOpen(HIP): Info [Measure] RamDb::Prefetch time: 0.211256 ms
MIOpen(HIP): Info2 [FindRecordUnsafe] Looking for key 3-224-224-7x7-64-112-112-1-3x3-2x2-1x1-0-NCHW-FP32-F in cache for file /home/liangnus/.config/miopen//gfx1102_16.HIP.2_20_0_f185a6464-dirty.ufdb.txt
MIOpen(HIP): Info2 [Measure] Db::FindRecord time: 0.286266 ms
MIOpen(HIP): Info [GetWorkSpaceSize] 7375872
MIOpen(HIP): Command [LogCmdFindConvolution] ./bin/MIOpenDriver conv -n 1 -c 3 -H 224 -W 224 -k 64 -y 7 -x 7 -p 3 -q 3 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -F 1 -t 1
MIOpen(HIP): Info [FindConvFwdAlgorithm] requestAlgoCount = 1, workspace = 7375872
MIOpen(HIP): Info [GetSolutions]
MIOpen(HIP): Info2 [ValidateUnsafe] DB file is newer than cache: 3876535210630, 3116245344375
MIOpen(HIP): Info2 [FindRecord] RamDb file is newer than cache, prefetching
MIOpen(HIP): Info [Measure] RamDb::Prefetch time: 0.214552 ms
MIOpen(HIP): Info2 [FindRecordUnsafe] Looking for key 3-224-224-7x7-64-112-112-1-3x3-2x2-1x1-0-NCHW-FP32-F in cache for file /home/liangnus/.config/miopen//gfx1102_16.HIP.2_20_0_f185a6464-dirty.ufdb.txt
MIOpen(HIP): Info2 [Measure] Db::FindRecord time: 0.262162 ms

@masahi can you review this PR?

…m platform: MIOpen Error: No invoker was registered for convolution forward.

Signed-off-by: Liangxijun-1001 <[email protected]>
@masahi masahi merged commit 37329bf into apache:main Dec 5, 2023