Complicated search Topic is solved

Get help with using AutoHotkey (v1.1 and older) and its commands and hotkeys
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Complicated search

30 Oct 2017, 09:29

I begin to try describing the prerequisites, before I create a more concrete example.
(It's not quite easy to explain - I hope you understand my wish)

Description .:
I have a CSV file with articles (with about 30,000 products and 70 fields, from a database).
This file contains, among other fields, the following fields:
Our article number; BarCode1; BarCode2; BarCode3; BarCode4; Supplier's article number.
- Our article number .: is an unique field. (But can be the same in the other fields).
- BarCode1-4 .: are unique fields (But can sometime be found in "our article number" or more seldom in"Supplier's article number".)
- Supplier's article number .: Not a unique field.

I want to create two different types of search.
Search one:
ArtSearch is the variable I want to find in the CSV-file.
Is the ArtSearch found in the field Our article number?.
- Yes, the ArtSearch is found .: The contents, of all other fields, are saved for this product.
- No, the ArtSearch is NOT found ..:
- - A new search must be done in the field Barcode1-4. Is the ArtSearch, now found?
- - - Yes, the ArtSearch is found .: The contents, of all other fields, are saved for this product.
- - - No, the ArtSearch is NOT found. (handle this case)

Search two:
Is the ArtSearch found in any of the fields Our article number; BarCode1-4 or Supplier's article number. ?.
- Yes, Show all hits, in order to manually select the right product later.
(the variabel ArtSearch can be found one time in Our article number and one time in Barcode1-4 and many times in the field Supplier's article number.)
- No, the ArtSearch is NOT found. (handle this case)

Thoughts .:
Solving the problem of Pseudo-Arrays and Loops for each field does not feel like an optimal solution.
Each search could mean several 100,000 comparisons. Making a compilation of 1000-1500 products would take a very long time.
Creating an "associative array" of the article registry feels like a better solution, but how can I do searches in this?
Especially in the field Supplier's article number.[/ C] which is not a unique field, where multiple products may have the same content.
SirRFI
Posts: 404
Joined: 25 Nov 2015, 16:52

Re: Complicated search

30 Oct 2017, 13:40

Isn't this what for SQL was made for? To deal with database queries effectively?

Actually, some kind of reversed associative array may work. I have an idea, but the biggest deal might be about Supplier's article number field. Provide few rows/entries as sample input data.
Use

Code: Select all

[/c] forum tag to share your code.
Click on [b]✔[/b] ([b][i]Accept this answer[/i][/b]) on top-right part of the post if it has answered your question / solved your problem.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

31 Oct 2017, 05:12

SirRFI wrote:... Actually, some kind of reversed associative array may work. I have an idea, but the biggest deal might be about Supplier's article number field. Provide few rows/entries as sample input data.
The database is not an SQL database. To get information from the database, you can use ODBC. But I want to use an exported CSV file, to handle the data.
I know that the Supplier's article number is the most difficult to handle, because this article number can be found in every fields.
But this search is not so common, so the focus should primarily be on "Search one".
(It can be noted that the fields ArtNo and Barcode1 always have a value)
Have attempted to create test values for these fields. (maybe I need to describe more possibilities, to make the test even better.)

Code: Select all

TestValue =
(LTrim Join
	ArtNo;Barcode1;Barcode2;Barcode3;Barcode4;SuppArtNo`n
	0123;7340073855576;630026-111-09;;;0123`n
	000030-333-00;7350030498777;;;;567`n
	GT43179;7392248431795;;;;43179`n
	00014-16;7391487014448;;;;00014-16`n
	AI00014;7393533078640;610025-444-80;;;00014.16`n
	022-98;7340015461087;7340015461100;7340015461216;7340015462459;567`n
)
SirRFI
Posts: 404
Joined: 25 Nov 2015, 16:52

Re: Complicated search

31 Oct 2017, 05:20

What about Supplier's article number containing multiple values?
Use

Code: Select all

[/c] forum tag to share your code.
Click on [b]✔[/b] ([b][i]Accept this answer[/i][/b]) on top-right part of the post if it has answered your question / solved your problem.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

31 Oct 2017, 05:29

Yes!
The products come from different suppliers.
When they using less than 5 digits / characters, it is not uncommon for multiple vendors to have the same article number.
SirRFI
Posts: 404
Joined: 25 Nov 2015, 16:52

Re: Complicated search

31 Oct 2017, 05:59

No, to my understanding it can contain multiple article IDs. How does it look like in such case? Are they separated by a coma or something?
Use

Code: Select all

[/c] forum tag to share your code.
Click on [b]✔[/b] ([b][i]Accept this answer[/i][/b]) on top-right part of the post if it has answered your question / solved your problem.
SirRFI
Posts: 404
Joined: 25 Nov 2015, 16:52

Re: Complicated search  Topic is solved

31 Oct 2017, 08:36

Code: Select all

ArtNo;Barcode1;Barcode2;Barcode3;Barcode4;SuppArtNo
0123;7340073855576;630026-111-09;;;0123
000030-333-00;7350030498777;;;;567
GT43179;7392248431795;;;;43179
00014-16;7391487014448;;;;00014-16
AI00014;7393533078640;610025-444-80;;;00014.16
022-98;7340015461087;7340015461100;7340015461216;7340015462459;567

Code: Select all

GLOBAL table_content := {}	; Must be global to be used in functions, unless You want to pass it as parameter every time
Loop, Read, sample_file.csv
{
	Line := A_Index
	If (Line == 1)
	{
		GLOBAL columns := StrSplit(A_LoopReadLine, ";")	; ArtNo;Barcode1;Barcode2;Barcode3;Barcode4;SuppArtNo
		For ID,column in columns
			table_content[column] := {}
	}
	Else
	{
		row := StrSplit(A_LoopReadLine, ";")
		For ID,column in columns
			table_content[column,row[ID]] := Line
	}
}



ArtSearch(query)
{
	result := ""
	For ID,column in columns
	{
		If (column != "SuppArtNo")
		{
			If (table_content[column,query])
			{
				result := ReadRow(table_content[column,query])
				MsgBox % "Found " query " in " column " (line: " table_content[column,query] "). Contains:`n" PrintRow(result)
				Break
			}
		}
	}
	If (!result)
		MsgBox % query " was not found."
}

ReadRow(line)
{
	FileReadLine, output, sample_file.csv, % line
	Return % StrSplit(output, ";")	; Returns array of the row's contents
}

PrintRow(line)
{
	output := ""
	For ID,content in line
		output .= columns[ID] ":`t" content "`n"
	Return % output
}



; TEST
ArtSearch("GT43179")	; ArtNo, line 4
ArtSearch("610025-444-80")	; BarCode2, line 6
ArtSearch("7340073855576")	; BarCode1, line 2

Code: Select all

Found GT43179 in ArtNo (line: 4). Contains:
ArtNo:	GT43179
Barcode1:	7392248431795
Barcode2:	
Barcode3:	
Barcode4:	
SuppArtNo:	43179

Found 610025-444-80 in Barcode2 (line: 6). Contains:
ArtNo:	AI00014
Barcode1:	7393533078640
Barcode2:	610025-444-80
Barcode3:	
Barcode4:	
SuppArtNo:	00014.16

Found 7340073855576 in Barcode1 (line: 2). Contains:
ArtNo:	0123
Barcode1:	7340073855576
Barcode2:	630026-111-09
Barcode3:	
Barcode4:	
SuppArtNo:	0123
You can split SuppArtNo into sub-array, and try searching there using similar method.

I call this reversed-array, because values are actually keys. It should be good for your scenario, as it does not compare anything, just check if a key exist for given column, and if it does - read contents from file at given line.
Use

Code: Select all

[/c] forum tag to share your code.
Click on [b]✔[/b] ([b][i]Accept this answer[/i][/b]) on top-right part of the post if it has answered your question / solved your problem.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

31 Oct 2017, 08:53

I try to explain again.
The focus is on six fields in the article registry. (see the description above for these)
Field1 .: "Our article number" (even called ArtNo)
This field have always a value, and this value is unique.

Field2 .: "Barcode1"
Field3 .: "Barcode2"
Field4 .: "Barcode3"
Field5 .: "Barcode4"
One of this fields (always Field2?) have always a value, and this value is unique.
But it may be the same value as in field "Field1".
For example .: If the value in Field2 is the same as the value in Field1 it is always on the same product.

Field6 .: "Supplier's article number" (even called SupArtNo)
This field may contain anything (and almost any characters). This is not a unique field, ie duplicates can occur in this field.
It means that different suppliers may have the same article number but on different products.
If SupArtNo is short from a supplier (less than 5 characters), it is quite often that duplicates is created in this field.
Sometimes, but not often, the ArtNo and BarCode and SupArtNo can have the same value on one product.

Hope I could further explain the fields, and the relationship between the fields better.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

31 Oct 2017, 08:59

Thanks! SirRFI
I want to analyse your solution.
just me
Posts: 9425
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Complicated search

31 Oct 2017, 09:17

Are you searching for exact matches?
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

31 Oct 2017, 10:44

[y2be][/y2be]
just me wrote:Are you searching for exact matches?
Yes! (but not upper- or lowercase)
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

14 Nov 2017, 15:54

What happens on these program rows?

Code: Select all

		row := StrSplit(A_LoopReadLine, ";")
		For ID,column in columns
			table_content[column,row[ID]] := Line
SirRFI
Posts: 404
Joined: 25 Nov 2015, 16:52

Re: Complicated search

14 Nov 2017, 18:58

Albireo wrote:What happens on these program rows?

Code: Select all

row := StrSplit(A_LoopReadLine, ";")
Each line of your data is read separately, therefore A_LoopReadLine contains entire line, like:

Code: Select all

0123;7340073855576;630026-111-09;;;0123
The line is split onto array using ; symbol. The array is saved to row variable. Result:

Code: Select all

["0123", "7340073855576", "630026-111-09", "", "", "0123"]

Code: Select all

For ID,column in columns
	table_content[column,row[ID]] := Line
columns array, which is created earlier in the code, contains column names - one per key.
In this part You loop through the array elements - ID is the key name (index), column is it's content (column's name).
Line variable contains current A_Index from higher-level loop.
As result, You write to table_content, where first key is column's name and second is row's array content from ID index. Example:

Code: Select all

table_content["ArtNo","0123"] := 1
table_content["Barcode1","7340073855576"] := 1
table_content["Barcode2","630026-111-09"] := 1
table_content["Barcode3",""] := 1
table_content["Barcode4",""] := 1
table_content["SuppArtNo","0123"] := 1
table_content["ArtNo","000030-333-00"] := 2
table_content["Barcode1","7350030498777"] := 2
; ...
Use

Code: Select all

[/c] forum tag to share your code.
Click on [b]✔[/b] ([b][i]Accept this answer[/i][/b]) on top-right part of the post if it has answered your question / solved your problem.
Albireo
Posts: 1747
Joined: 16 Oct 2013, 13:53

Re: Complicated search

08 Jan 2018, 07:37

Have try to understand and use the program above, for my real problem.
I like the solution (for search it's very fast), but some questions have encountered if I want to do other things.

Is it possible/realistically to make many arrays with the "Header" and the "ArtNo" like this .:

Code: Select all

table_content["0123","Barcode1"]: = "7340073855576"
table_content["0123","Barcode2"] := "630026-111-09"
table_content["0123","Barcode3"] := ""
table_content["0123","Barcode4"] := ""
table_content["0123","SuppArtNo"] := "0123"
table_content["000030-333-00","Barcode1"] := "7350030498777"
table_content["000030-333-00","Barcode2"] := ""
; ...
(the real file has as I have told before, 30,000 rows and 70 columns)
How much memory will a solution like this require? (30000x70= 2,1MB?)
Are there any limitations / disadvantages to create a solution like this?

Return to “Ask for Help (v1)”

Who is online

Users browsing this forum: Billykid, Google [Bot] and 217 guests