Tuesday, 29 May 2012

Search Driven Applications with Office 365 / SharePoint Online - Part I

Part I: Background on Search Driven Applications and a first solution proposal for Office 365 / SharePoint Online.

I have written about building Search Driven Applications with SharePoint 2010 before, and some of my conference sessions also covered this topic. Here are some links, for example:

Article in SharePoint Magazine:

Video from last year's Collaboration Days (http://www.collaborationdays.ch):

You can build Search Driven Applications in several ways, based on different techniques. One of the easiest is to use the Query Object Model of SharePoint Search to create “Fixed Keyword Queries”. For example, this query shows all SharePoint sites that you have access to:
contentclass:STS_Site contentclass:STS_Web contentClass:sts_listitem_850
For more details, have a look at the blog post by Laura Rogers: LINK

But the “magic” of a Search Driven Solution lies in aggregating similar crawled properties into one managed property. See the schema diagram:
Example: you have two lists in your SharePoint farm that both contain information about customers. For historical reasons, maybe a migration from SharePoint 2007 or a merge with other systems, the customer information lives in independent, differently named fields in different SharePoint lists. Mapping these two Crawled Properties to one Managed Property, called for example “cKunde”, allows you to query both fields with one call (for example: cKunde:AOL returns results from both lists). Pooling Crawled Properties into Managed Properties, as shown in the next picture, is one of the technical basics of a Search Driven Application.
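To make the pooling idea concrete, here is a minimal toy model in Python. It is only a sketch of the concept, not how the SharePoint index works internally, and the crawled property names `ows_Customer` and `ows_Kunde` are hypothetical names chosen for illustration.

```python
# Toy model: one managed property pools several crawled properties.
# The crawled property names below are hypothetical examples.
MANAGED_PROPERTIES = {
    "cKunde": ["ows_Customer", "ows_Kunde"],
}


def matches(item, managed_property, term, mapping=MANAGED_PROPERTIES):
    """Return True if any crawled property pooled under the managed
    property contains the search term (case-insensitive)."""
    crawled_names = mapping[managed_property]
    needle = term.lower()
    return any(needle in str(item.get(name, "")).lower() for name in crawled_names)


# Items from two different lists that use different field names.
items = [
    {"list": "Customers (migrated)", "ows_Customer": "AOL"},
    {"list": "Kunden", "ows_Kunde": "AOL Germany"},
    {"list": "Kunden", "ows_Kunde": "Contoso"},
]

# One query against the managed property finds items from both lists.
hits = [item["list"] for item in items if matches(item, "cKunde", "AOL")]
```

The single lookup against “cKunde” touches both underlying fields, which is exactly what the mapping in the Search Service Application gives you on premises.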

Pooling properties is no problem in an on-premises SharePoint installation. But with Office 365 / SharePoint Online we ran into some problems. The main problem is that we cannot configure the Search Service Application in Office 365 / SharePoint Online. So we cannot map Crawled Properties to Managed Properties, and without our own Managed Properties we cannot query the content for specific fields. We can of course use the default Managed Properties such as “Author”, “Filetype”, “Name” etc. and build Search Driven Solutions based on them. The constraint here is that we cannot freely edit them as needed. The Managed Property “Filetype”, for example, is filled by the system with a given value. So we have to use some “tricks” to build Search Driven Applications with Office 365 / SharePoint Online.

My solution for this problem is to use the tagging feature in SharePoint 2010. With this feature we can tag any content with specific, individual information that can later be used to aggregate it under this scope or in combination with other tags / scopes. The feature “Metadata Publishing -> Save metadata on this list as social tags” allows us to require users to fill in metadata in keyword fields, or to set default values, which are then automatically published as social tags.

Based on this there are several ways to build Search Driven Solutions. In this blog post series I will take a closer look at two of them.
Let’s have a look at the first, simpler one:
There is a Managed Property called SocialTagId that can be used to query for a specific tag, and so it can be used to build a search driven web part.
Example search query: SocialTagId:"6a…b-754b-480e-86e8-5a…ae"

This Managed Property can also be used with standard search syntax for building AND / OR queries and be mixed with other search terms.

Example: SocialTagId:"fb2d872a-8750-4913-bc23-b0b32252b489" OR SocialTagId:"25dee478-f426-4dc1-9d25-340e9ecce093" (shows any content that is tagged with at least one of the tags behind these SocialTagIds)
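If a web part or script has to assemble such a query for several tags, a small helper keeps the syntax consistent. The following Python function is a sketch only; the name `social_tag_query` is made up for this post, and it does nothing more than produce the keyword query string shown above.

```python
import uuid


def social_tag_query(tag_ids, operator="OR"):
    """Build a keyword query string for one or more SocialTagId values.

    Hypothetical helper for illustration: it only formats the query text,
    it does not call SharePoint.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    clauses = []
    for tag_id in tag_ids:
        # uuid.UUID raises ValueError for anything that is not a GUID,
        # and str() returns the lowercase, hyphenated form.
        guid = str(uuid.UUID(tag_id))
        clauses.append(f'SocialTagId:"{guid}"')
    if not clauses:
        raise ValueError("at least one tag ID is required")
    return f" {operator} ".join(clauses)


query = social_tag_query([
    "fb2d872a-8750-4913-bc23-b0b32252b489",
    "25dee478-f426-4dc1-9d25-340e9ecce093",
])
```

The resulting string can be pasted into the search box or used as the fixed keyword query of a results web part.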

Some points to note:
-          The result is exactly the same as what you get when you navigate to the Tag Profile page for a specific tag.
-          The benefit is that you can use the result set coming from the search in a separate web part and customize the style etc. in an easy way.
-          We are not able to do a fuzzy search with this.
-          You have to find out the SocialTagId that belongs to your tag. This is not really a problem. To do this, navigate to the Tag Profile and select “To find content related to '%tag%' in search, please click here.” The result is a search with the Managed Property “SocialTagId” and the ID of the selected tag:

To customize these search results using SharePoint Designer and XSLT, follow the steps described by Laura Rogers: LINK
Webcast with hands on system demos:
Summary: we can build search driven solutions based on the tagging feature in Office 365 / SharePoint Online. With advanced Document Library and List features like “Metadata Publishing” we can improve the usability and the user experience of adding this kind of information.
To aggregate any content tagged with specific tags we can use the Managed Property SocialTagId.
In the next post I will show how we can aggregate content based on several different but similar tags.

Monday, 28 May 2012

Custom Sort Order in SharePoint Search

Custom sort option in SharePoint Search Results

This post is based on a question from the community. I thought it was worth a blog post as well.

The question is: a customer has the following requirement. The Core Results web part of the SharePoint 2010 search can sort by relevance and by date. The customer's idea is to write a custom Core Results web part that does the sorting by additional properties itself. Is that a good idea? What needs to be considered?

That is a good and much-discussed question. There are already several approaches to it on the web, for example here:

or this web part:

All in all, this already works. You can also do it by hand with jQuery. The problem with all of these solutions is that the sorting is done on the client. Depending on the amount of data, this can lead to very poor performance, and it puts the load on the client.

If you implement such a solution as a SharePoint feature that runs on the app server, you have at least moved the load from the client to a scalable server. The point is that in both cases the result set first has to be loaded from the SQL Server, then re-sorted, and then displayed. The feature-based solution does, of course, scale better than a client-side solution.

There is, of course, a reason why the search only offers the two sort options “Relevance” and “Date”. The stored procedure proc_MSS_GetMultipleResults returns the results sorted by relevance on the SQL side (for details see here: LINK). Re-sorting by date afterwards is easy to do and does not cost much computing time. Anything else can get out of hand, especially with larger amounts of data.

I recommend doing the sorting in the SQL Server. It does this best, and far better and faster than .NET. This is quite easy to do if the property to sort by is one of the following: Rank, Title, Author, Size, Path, Write, HitHighlightedSummary, HitHighlightedProperties. These are returned in the result from the SQL Server by default. The call that goes to the SQL Server looks roughly like this:

<QueryText language="en-US" type="MSSQLFT">Select PopularSocialTags,Rank, Title, Author, Size, Path, Write, HitHighlightedSummary, HitHighlightedProperties FROM Scope() WHERE FREETEXT(DefaultProperties, '%searchTerm%') ORDER BY "Rank" DESC</QueryText>

So ORDER BY is easy to adapt here. If you want to sort by custom properties, it gets more complicated. A good way to experiment with this is the tool http://fastforsharepoint.codeplex.com/ (it is called FAST, but it also works with the standard search). There you can edit the XML that is sent to the search web service and experiment with it…

…The official way would be to use FAST and create a custom rank profile.
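For the built-in columns listed above, adapting the ORDER BY clause could be sketched roughly as follows. This is a Python illustration that only builds the QueryText element shown earlier; the function name `build_query_text` is made up, and escaping single quotes by doubling them is an assumption borrowed from standard SQL that you should verify against your environment.

```python
import xml.etree.ElementTree as ET

# Columns that come back from the SQL Server by default (see the list above).
SORTABLE = (
    "Rank", "Title", "Author", "Size", "Path", "Write",
    "HitHighlightedSummary", "HitHighlightedProperties",
)
SELECTED = ("PopularSocialTags",) + SORTABLE


def build_query_text(search_term, order_by="Rank", descending=True):
    """Return a QueryText element (as a string) with a custom ORDER BY."""
    if order_by not in SORTABLE:
        raise ValueError(f"cannot sort by {order_by!r} without extra work")
    # Assumption: single quotes are escaped by doubling them.
    escaped = search_term.replace("'", "''")
    direction = "DESC" if descending else "ASC"
    sql = (
        f"SELECT {', '.join(SELECTED)} FROM Scope() "
        f"WHERE FREETEXT(DefaultProperties, '{escaped}') "
        f'ORDER BY "{order_by}" {direction}'
    )
    element = ET.Element("QueryText", {"language": "en-US", "type": "MSSQLFT"})
    element.text = sql
    return ET.tostring(element, encoding="unicode")


query_text = build_query_text("sharepoint", order_by="Title", descending=False)
```

Restricting `order_by` to the default columns mirrors the point made above: anything outside that list, such as a custom managed property, needs more than a changed ORDER BY.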

Monday, 7 May 2012

SharePoint Search – a look behind the scene Part IV

Part IV: Last part with a summary and some further issues

In the previous parts of the series “SharePoint Search – a look behind the scene” (Part I, Part II, Part III) we saw how SharePoint interacts with the SQL Server and which data is stored in which SQL Server database belonging to the Search Service.

In Part III I showed where custom managed properties belonging to an object are stored and how SharePoint handles this correlation. As one of the missing pieces, let's now have a look at social tags. Social tags are stored in the user profile and not within the content. So it was interesting for me to see that in the Search database the social tag information is stored together with the content information in one table. The table is MSSDocResults in the Search_Service_Application_PropertyStoreDB. The following link shows a description of the table: http://msdn.microsoft.com/en-us/library/dd773971(v=office.12).aspx The column PopularSocialTags contains the tags assigned by users.

The following example shows a T-SQL search for the tag “I like it”:

SELECT * FROM [Search_Service_Application_PropertyStoreDB].[dbo].[MSSDocResults]
WHERE [PopularSocialTags] LIKE '%I like it%'

What we also see here is that this table is “de-normalized”: the tag text itself, for example “SQL Server”, is stored in the table MSSDocResults, not only its ID / key.

This is for performance reasons. Search is read-optimized, and it is faster to deliver the relation between content and social tags without doing joins first.
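The trade-off can be illustrated with a small, self-contained Python example using an in-memory SQLite database. The tables below are a deliberately simplified toy schema invented for this illustration, not the real SharePoint schema, and the “;” separator between tags is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized toy model: tag labels live in their own table.
    CREATE TABLE Tags (TagId INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE DocTags (DocId INTEGER, TagId INTEGER);

    -- De-normalized toy model: the tag text sits next to the document.
    CREATE TABLE DocResults (
        DocId INTEGER PRIMARY KEY,
        Title TEXT,
        PopularSocialTags TEXT
    );

    INSERT INTO Tags VALUES (1, 'I like it'), (2, 'SQL Server');
    INSERT INTO DocTags VALUES (10, 1), (11, 2), (11, 1);
    INSERT INTO DocResults VALUES
        (10, 'Roadmap.docx', 'I like it'),
        (11, 'Tuning.docx', 'SQL Server;I like it'),
        (12, 'Notes.txt', '');
""")

# Normalized: a join is needed to get from the tag label to the documents.
normalized = conn.execute(
    """
    SELECT DISTINCT d.DocId
    FROM DocTags d
    JOIN Tags t ON t.TagId = d.TagId
    WHERE t.Label = ?
    ORDER BY d.DocId
    """,
    ("SQL Server",),
).fetchall()

# De-normalized: a single-table read answers the same question.
denormalized = conn.execute(
    "SELECT DocId FROM DocResults WHERE PopularSocialTags LIKE ? ORDER BY DocId",
    ("%SQL Server%",),
).fetchall()

conn.close()
```

Both queries return the same document, but the second one reads a single table, which is the read-optimized behaviour described above. The price is redundant tag text that has to be kept in sync when tags change.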
The keyword “performance” brings me to the next point. During this “look behind” series we have seen, and the SQL reports confirm it, that some tables are used more frequently than others and have a bigger impact on search performance.
There is a really good article on MSDN about which indexes are frequently used. The tables most frequently used by search queries are:
-          dbo.MSSDocProp
-          dbo.MSSDocResults
-          dbo.MSSOrdinal
in the Search_Service_Application_PropertyStoreDB.
So if performance is a bottleneck, it can help to place these tables / indexes on a separate filegroup on a fast disk subsystem.
Another interesting aspect is security. In a highly sensitive environment we have to be clear about which data is placed in which database and how it is protected. Communication between the SharePoint Application Server and the SQL Server SharePoint Search databases is not really critical: the search requests are compiled into BLOB data, as shown in Part I. But one potentially critical point is that data coming from external systems via BCS is stored in clear text in the Search_Service_Application_PropertyStoreDB, table dbo.MSSDocProps. This can be an issue. For details see Part III.

See the complete post, including the hands-on lab, as a webcast here: