(...which data is stored in the context of crawled and managed properties in the SharePoint Search databases?)
In the second post of this series we saw that the metadata shown in the result-set is stored in SQL Server, even when the data comes from an external BCS source. The given metadata such as “name”, “description”, etc. is stored in the PropertyBlob field explained in the first post. But this blob only contains data that is part of the managed property “HitHighlightedSummary”. What happens if we map a crawled property coming from a BCS source to a managed property?
We have an External Content Source called “LOB2”. This content source is connected to a SQL Server database called “MiCLAS_TEST”.
The External Content Source contains the table called “cis.Vorgang_z”.
If we now search, for example, for the term “330114”, which is a value from the column “VorNummer”, we get this result in SharePoint Search:
Catching the call with SQL Server Profiler (as described in Part II) and running it in SQL Server Management Studio, it looks like this:
The PropertyBlob in this case contains the following data as varbinary:
If we now add crawled properties to managed properties, the PropertyBlob changes [1]. So where is the information stored that is used to generate the updated PropertyBlob? (The update happens during a full crawl of the content source.) The answer is the table “dbo.MSSDocProps” in the “Search_Service_Application_PropertyStoreDB” database.
Let’s have a look using the DocId 2056 from the call above:
(For better understanding I also add the “FriendlyName” column from the table dbo.MSSManagedProperties to the query.)
FROM dbo.MSSDocProps v1, Search_Service_Application_DB_dd13ba19a7bb4ffaafcc3e626e73c949.dbo.MSSManagedProperties v2
WHERE DocId = 2056
AND v1.PID = v2.PID
Now I map some crawled properties to managed properties:
After a full crawl I execute the query again:
For the “LOB2Date” and “LOB2Cur” values you can see that the data is in the IIVal column. The “LOB2Bez” value is stored as clear text in the strVal2 column. LOB2Date is a cryptic datetime value based on the DateTime structure data type. LOB2Cur is a decimal value.
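To give an idea of how such a cryptic number could be turned back into a readable date, here is a minimal Python sketch. It rests on an assumption this post does not confirm: that the number in IIVal is a .NET DateTime tick count, meaning 100-nanosecond intervals since 0001-01-01. The function name ticks_to_datetime and the sample value are made up for illustration. If the stored value uses a different epoch or unit, the conversion has to be adapted.

```python
from datetime import datetime, timedelta


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert a .NET-style tick count to a Python datetime.

    Assumption (not confirmed by the post): one tick is 100 nanoseconds
    and tick 0 is 0001-01-01 00:00:00.
    """
    # Python's timedelta has microsecond resolution, so 10 ticks = 1 microsecond.
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)


# Illustrative value only, not taken from the LOB2 data.
print(ticks_to_datetime(630822816000000000))  # 2000-01-01 00:00:00
```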
[1] The updated PropertyBlob now contains the following:
So we see that all content contained in managed properties is stored in the “Search_Service_Application_PropertyStoreDB” database, even when the data comes from an external BCS source. This is interesting for security reasons (there will be a separate post about this soon) and also for SQL Server maintenance and index defragmentation. There is a very good article about this available on blogs.msdn: LINK
See the complete post, including the hands-on lab as a webcast, here:
FROM dbo.MSSDocResults AS P WITH(NOLOCK), DocIds AS T WHERE P.DocId = T.DocId OPTION (MAXDOP 1) ',0x00001592000000000000...700000031
So let’s see what exactly happens by disassembling that query call.
We can see that the stored procedure dbo.proc_MSS_GetMultipleResults is called. But we want to go one step deeper and find out what’s behind this call. (The stored procedure dbo.proc_MSS_GetMultipleResults will be part of one of the next posts.)
First of all, the DATALENGTH of the generated BLOB, which is stored in @joinData (details on the BLOB data can be found here: LINK), is used to set the variable @joinRows:
1. SET @joinRows = DATALENGTH(@joinData) / 8
Let’s see what the result is using this query:
The result is “50”
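The same arithmetic can be tried outside SQL Server. The following minimal Python sketch mirrors the DATALENGTH(@joinData) / 8 calculation. The function name join_rows and the all-zero sample blob are made up for illustration and are not the blob from the Profiler trace.

```python
def join_rows(join_data: bytes) -> int:
    """Mirror of: SET @joinRows = DATALENGTH(@joinData) / 8.

    Every entry in the blob takes 8 bytes, so the entry count is the
    byte length divided by 8 (integer division, as in T-SQL).
    """
    return len(join_data) // 8


# Made-up blob of 400 zero bytes, only to show the arithmetic.
sample_blob = bytes(400)
print(join_rows(sample_blob))  # 50
```

A result of 50 therefore means the blob carries 50 entries of 8 bytes each.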
In the next step a temporary result-set called “DocIds” is created using the “WITH” SQL statement.
2. WITH DocIds(DocId, Value) AS (…
The call generates a result-set looking like this:
In the first column we see the DocIds we will need in the next step. But let’s have a look at how this result is generated. The key part of the query is this one:
FROM dbo.MSSOrdinal AS ord WITH(NOLOCK) WHERE ord.n <= @joinRows
Applying CAST(SUBSTRING(@joinData, ((ord.n-1)*8) + 1, 4) AS INT) to the BLOB data stored in @joinData extracts the list of item identifiers and their ranks, which are held as Id/value pairs, as also described here: LINK. Each value of ord.n selects the 8-byte pair that starts at byte (ord.n-1)*8 + 1, and the first 4 bytes of that pair are cast to the DocId. This is the only magic behind the result-set shown in step 2.
Next is a simple join between the temporary result-set “DocIds” and the content of the table “dbo.MSSDocResults”. The join is done by the part "WHERE P.DocId = T.DocId" shown in the query below:
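The slicing can be reproduced with a few lines of Python, which may help to see what the CAST/SUBSTRING expression does. This is a sketch under assumptions, not the code SharePoint runs. It assumes that the second 4 bytes of each pair hold the Value column (the post only shows the expression for the first 4 bytes), and the sample blob is made up. The big-endian byte order matches how SQL Server converts binary(4) to INT.

```python
from typing import List, Tuple


def decode_join_data(join_data: bytes) -> List[Tuple[int, int]]:
    """Split the blob into (DocId, Value) pairs of 4 + 4 bytes."""
    join_rows = len(join_data) // 8  # same as DATALENGTH(@joinData) / 8
    pairs = []
    for n in range(1, join_rows + 1):  # n plays the role of ord.n
        # SUBSTRING is 1-based, Python slices are 0-based, hence no "+ 1".
        start = (n - 1) * 8
        doc_id = int.from_bytes(join_data[start:start + 4], "big", signed=True)
        # Assumption: bytes 5..8 of each pair are the Value column.
        value = int.from_bytes(join_data[start + 4:start + 8], "big", signed=True)
        pairs.append((doc_id, value))
    return pairs


# Made-up blob with two pairs: (0x00000808, 1) and (0x00001592, 2).
sample = bytes.fromhex("00000808" "00000001" "00001592" "00000002")
print(decode_join_data(sample))  # [(2056, 1), (5522, 2)]
```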
DocIds AS T WHERE P.DocId = T.DocId OPTION (MAXDOP 1)
The result is:
So we see that the results returned by SQL Server contain all the data needed, even when the data comes from an external BCS source. The given metadata such as “name”, “description”, etc. is stored in the PropertyBlob field explained in the first post.
If we now call the profile page of the BCS source, all configured data fields are needed. This results in a call against the data source defined in the BCS model. In my case this is also a SQL Server call because my External Data Source is a SQL Server:
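To make the join on DocId concrete, here is a small Python sketch of what "WHERE P.DocId = T.DocId" does with the decoded pairs. The function name join_doc_results, the dictionary layout, and the placeholder PropertyBlob bytes are all made up for illustration. The real rows in dbo.MSSDocResults have more columns than shown here.

```python
def join_doc_results(doc_ids, doc_results):
    """Inner join of (DocId, Value) pairs with rows keyed by DocId."""
    joined = []
    for doc_id, value in doc_ids:
        row = doc_results.get(doc_id)
        if row is not None:  # as in the SQL join, unmatched DocIds drop out
            joined.append({"DocId": doc_id, "Value": value, **row})
    return joined


# Hypothetical data: DocId 9999 has no row and is therefore not returned.
doc_ids = [(2056, 1), (5522, 2), (9999, 3)]
doc_results = {
    2056: {"PropertyBlob": b"\x01\x02"},
    5522: {"PropertyBlob": b"\x03\x04"},
}
for row in join_doc_results(doc_ids, doc_results):
    print(row["DocId"], row["Value"], row["PropertyBlob"].hex())
```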