Разделы презентаций


Search Algorithms and Data Structures

Содержание

About Me

Слайды и текст этой презентации

Слайд 1Search Algorithms and Data Structures
Mikhail Khludnev Khabarovsk'19

Search Algorithms and Data StructuresMikhail Khludnev														Khabarovsk'19

Слайд 2About Me

About Me

Слайд 3RDBMS is …
https://www.geeksforgeeks.org/database-file-indexing-b-tree-introduction/

RDBMS is …https://www.geeksforgeeks.org/database-file-indexing-b-tree-introduction/

Слайд 4Composite Index
CREATE INDEX class_pos_index ON users (class, position);

Composite Index CREATE INDEX class_pos_index ON users (class, position);

Слайд 5eComm Facets

eComm Facets

Слайд 6Inverted Index

Inverted Index

Слайд 7Lucene, Solr and Elasticsearch
http://www.supercoloring.com/coloring-pages/romulus-and-remus-with-the-she-wolf.

Lucene, Solr and Elasticsearchhttp://www.supercoloring.com/coloring-pages/romulus-and-remus-with-the-she-wolf.

Слайд 8The First Indices
Общественное достояние, https://commons.wikimedia.org/w/index.php?curid=433142
https://rbscp.lib.rochester.edu/489

The First IndicesОбщественное достояние, https://commons.wikimedia.org/w/index.php?curid=433142https://rbscp.lib.rochester.edu/489

Слайд 9Term Dictionary
rubens
rub*
[rome TO rustic]
*uber
*man*
By Claudio Rocchini - Own work, CC

BY 2.5, https://commons.wikimedia.org/w/index.php?curid=2118795

Term Dictionaryrubensrub*[rome TO rustic]*uber*man*By Claudio Rocchini - Own work, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=2118795

Слайд 10Offsets to Postings List File
10: 8, 9, 10, 14, 18,

23, 24, 26, 31, 35; 8: 8, 11, 14, 18,

21, 23, 25, 27; 8: 4, 5, 6, 9, 13, 14, 18, 22; 8: 3, 4, 7, 9, 12, 13, 17, 20; 7: 5, 9, 12, 14, 19, 23, 28; 9: 0, 2, 5, 6, 11, 13, 17, 22, 27
Offsets to Postings List File10: 8, 9, 10, 14, 18, 23, 24, 26, 31, 35; 8: 8,

Слайд 11Postings Codecs
delta 8, 9, 10, 14, 18, 23

=> 1,1,1,4,4,5
vint

5 => 000001012 129 => 100000012 000000012
PFOR 0012 0012 0012 1002 1002 1012

Postings Codecsdelta  			8, 9, 10, 14, 18, 23  =>  1,1,1,4,4,5vint

Слайд 12Query Execution

Query Execution

Слайд 13Stored Field Retrieval
{"name": "first doc", "count": 43} {"name": "another doc",

"count":5} {"name": "the last doc", "count":3435} {"name": "the book", "count":-1334}



Stored Field Retrieval{

Слайд 14Indexing

Indexing

Слайд 15Index Document
curl -X PUT "localhost:9200/twitter/tweet/1" -H 'Content-Type: application/json' -d'
{

"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" :

"trying out Elasticsearch"
}
'

Index Documentcurl -X PUT

Слайд 16Mapping
curl -X PUT "localhost:9200/my_index" -H 'Content-Type: application/json' -d'{
"mappings": {

"properties": {
"title": { "type":

"text" },
"name": { "type": "text" },
"age": { "type": "integer" },
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis" } } }}'

Mappingcurl -X PUT

Слайд 17Analysis
curl -X PUT "localhost:9200/my_index" -H 'Content-Type: application/json' -d'
{ "settings": {

"analysis": {
"analyzer": {

"my_custom_analyzer": { "type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase"
]
} } } }}
'
Analysiscurl -X PUT

Слайд 18In-memory Buffer
{
"user" : "kimchy",
"message" :

"trying out

Elasticsearch"
}

user

message

In-memory Buffer{

Слайд 19Segments
1,2,2,3
1
2,2,1

Segments1,2,2,312,2,1

Слайд 20_refresh
Elasticsearch: The De€nitive Guide
Copyright © 2015 Elasticsearch. All rights

reserved.

_refreshElasticsearch: The De€nitive Guide Copyright © 2015 Elasticsearch. All rights reserved.

Слайд 21_refresh vs _flush
Elasticsearch: The De€nitive Guide
Copyright © 2015 Elasticsearch.

All rights reserved.

_refresh vs _flushElasticsearch: The De€nitive Guide Copyright © 2015 Elasticsearch. All rights reserved.

Слайд 22Indexing Performance
_bulk
Threads
ETL is hard

Indexing Performance_bulkThreadsETL is hard

Слайд 23Searching

Searching

Слайд 24Result Page
{
"timed_out": false,
"took": 62,
"_shards":{

"total" : 10,
"successful"

: 1,
"skipped" : 0,
"failed" : 0
},
"hits":{
"total" : {
"value": 23539,
"relation": "eq"
},
"max_score": 1.3862944,

"hits" : [
{
"_index" : "twitter",
"_type" : "_doc",
"_id" : "0",
"_score": 1.3862944,
"_source" : {
"user" : "kimchy",
"date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch",
"likes": 0
}
}, {
"_index" : "twitter",
"_type" : "_doc",
"_id" : "242",
"_score": 0.72944,
"_source" : {
"user" : "foo bar",
"date" : "2009-11-15T14:12:12",
"message" : "searching Elasticsearch",
"likes": 0
}
}
] }}
Result Page{

Слайд 25Result Page Cropping
10: 8, 9, 10, 14, 18, 23, 24,

26, 31, 35; 8: 8, 11, 14, 18, 21, 23,

25, 27; 8: 4, 5, 6, 9, 13, 14, 18, 22; 8: 3, 4, 7, 9, 12, 13, 17, 20;


O(n log (p))

http://www.cse.hut.fi/en/research/SVG/TRAKLA2/tutorials/heap_tutorial/taulukkona.html

Result Page Cropping10: 8, 9, 10, 14, 18, 23, 24, 26, 31, 35; 8: 8, 11, 14,

Слайд 27https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html

Слайд 28Real Life Usage

Real Life Usage

Слайд 29Elastic Logstash Kibana (ELK) Stack
https://www.elastic.co/guide/en/logstash/current/deploying-and-scaling.html

Elastic Logstash Kibana (ELK) Stackhttps://www.elastic.co/guide/en/logstash/current/deploying-and-scaling.html

Слайд 30Scaling
https://www.digitalocean.com/community/tutorials/understanding-database-sharding

Scalinghttps://www.digitalocean.com/community/tutorials/understanding-database-sharding

Слайд 31Summary
Why search?
Inverted Index
Indexing
Searching
Scaling
ELK

SummaryWhy search?Inverted Index IndexingSearchingScalingELK

Слайд 32200 threads
https://blog.newrelic.com/engineering/expected-distributions-website-response-times/

200 threadshttps://blog.newrelic.com/engineering/expected-distributions-website-response-times/

Слайд 332,2,1
Index Hierarchy
2,2,1
2,2,1
Segment - inverted index

Lucene Index - chain

Shard
Elasticsearch Index
Index

pattern: access-log-*
access-log-2019-08-06
access-log-2019-08-07
access-log-2019-08-08
….

2,2,1Index Hierarchy2,2,12,2,1Segment - inverted indexLucene Index - chain ShardElasticsearch IndexIndex pattern: access-log-*access-log-2019-08-06access-log-2019-08-07access-log-2019-08-08….

Слайд 34An Index is …
an derived data structure

An Index is … an derived data structure

Слайд 36Term Frequency
red
red bar
red red red bar red

Term Frequencyredred barred red red bar red

Слайд 37Inverse Document Frequency
red bull
Chicago Bulls
Red Socks
Computing is too important to

be left to men
Karen Spärck-Jones

Inverse Document Frequencyred bullChicago BullsRed SocksComputing is too important to be left to menKaren Spärck-Jones

Обратная связь

Если не удалось найти и скачать доклад-презентацию, Вы можете заказать его на нашем сайте. Мы постараемся найти нужный Вам материал и отправим по электронной почте. Не стесняйтесь обращаться к нам, если у вас возникли вопросы или пожелания:

Email: Нажмите что бы посмотреть 

Что такое TheSlide.ru?

Это сайт презентации, докладов, проектов в PowerPoint. Здесь удобно  хранить и делиться своими презентациями с другими пользователями.


Для правообладателей

Яндекс.Метрика